

CANADIAN JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. X, NO. 1
1958 


The inner plethysm of S-functions    D. E. Littlewood
Products and plethysms of characters with orthogonal, symplectic and symmetric groups    D. E. Littlewood
A special formula for the Lie character    Robert L. Davis
The representation type of algebras and subalgebras    J. P. Jans
Goursat’s theorem and the Zassenhaus lemma    Joachim Lambek
The term rank of a matrix    H. J. Ryser
On the Hasse-Minkowski invariant of the Kronecker product of matrices    Manohar N. Vartak
A family of difference sets    R. G. Stanton and D. A. Sprott
Network flow and systems of representatives    L. R. Ford, Jr. and D. R. Fulkerson
On the random disorientation of two cubes    D. C. Handscomb
Geodesic groups of minimal surfaces    H. G. Helfenstein
Spectral theory for a class of non-normal operators II    Harry Gonshor
On the convergence of mean values over lattices    Wolfgang Schmidt
A generalized Tauberian theorem    F. R. Keogh and G. M. Petersen
On the existence of the Burkill integral    H. Kober
On generalized averaging operators    R. P. Boas, Jr.
Mixed problems for linear systems of first order equations    G. F. D. Duff


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 


by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, G. F. D. Duff, R. D. James, R. L. Jeffery, 
J.-M. Maranda, G. de B. Robinson, H. Zassenhaus


with the co-operation of 


A. D. Alexandrov, R. Brauer, W. P. Brown, D. B. DeLury, J. Dixmier, 
P. Hall, I. Halperin, P. Scherk, J. L. Synge, A. W. Tucker, 
W. J. Webber, M. Wyman 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, G. F. D. Duff, University of Toronto. Everything 
possible should be done to lighten the task of the reader; the notation 
and reference system should be carefully thought out. Every paper 
should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $8.00. This is reduced to $4.00 for individual members of 
recognized Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton College 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary’s University University of Toronto 


National Research Council of Canada 
and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 













THE INNER PLETHYSM OF S-FUNCTIONS 
D. E. LITTLEWOOD 


1. Introduction. In a previous paper (2), the inner product $\{\lambda\}\cdot\{\mu\}$ of two
S-functions was defined for $(\lambda)$, $(\mu)$ partitions of the same integer $n$.
Briefly, the ordinary product $\{\lambda\}\{\mu\}$ of two S-functions corresponds to the
analysis of the direct product of two corresponding representations of the
full linear group, while the inner product $\{\lambda\}\cdot\{\mu\}$ corresponds to the analysis
of the direct product of two representations of the corresponding symmetric
group. Thus, if

1.1    $\chi^{(\lambda)}\chi^{(\mu)} = \sum g_{\lambda\mu\nu}\,\chi^{(\nu)}$

then

$\{\lambda\}\cdot\{\mu\} = \sum g_{\lambda\mu\nu}\{\nu\}.$


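This operation is commutative,

1.2    $\{\lambda\}\cdot\{\mu\} = \{\mu\}\cdot\{\lambda\},$

and distributive with respect to addition,

1.3    $\{\lambda\}\cdot(\{\mu\} + \{\nu\}) = \{\lambda\}\cdot\{\mu\} + \{\lambda\}\cdot\{\nu\}.$

Also the symbol $g_{\lambda\mu\nu}$ which arises in this inner product is symmetric with
respect to all three suffixes, so that

1.4    $g_{\lambda\mu\nu} = g_{\lambda\nu\mu} = g_{\mu\lambda\nu} = g_{\mu\nu\lambda} = g_{\nu\lambda\mu} = g_{\nu\mu\lambda}.$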


In the case of an inner product of two identical S-functions, $\{\lambda\}\cdot\{\lambda\}$, however,
a further analysis is possible. The expression $\{\lambda\}\cdot\{\lambda\}$ corresponds to the direct
product of two identical representations of the symmetric group, and such
a direct product can be analysed into its symmetric and skew-symmetric
constituents. Thus if $M_\rho$ is the matrix representing a symmetric group element
$S_\rho$, then the direct product $M_\rho \times M_\rho$ may be analysed and shown to be
equivalent to the direct sum of the second induced matrix and the second
compound matrix of $M_\rho$,

$M_\rho \times M_\rho \simeq M_\rho^{\{2\}} \dotplus M_\rho^{\{1^2\}},$

where $\simeq$ denotes equivalence and $\dotplus$ direct sum.
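The splitting into symmetric and skew-symmetric constituents can be checked numerically on characters. The sketch below is not taken from the paper: it assumes the standard formulas $\tfrac12(\chi(g)^2 \pm \chi(g^2))$ for the spurs of the second induced and second compound matrices, takes $\chi$ to be the character of $\{n-1,1\}$ (fixed points minus one), and compares the two parts with explicit class functions for $\{n\}$, $\{n-1,1\}$, $\{n-2,2\}$ and $\{n-2,1^2\}$. All function names are illustrative.

```python
from itertools import permutations
from math import comb

def compose_square(p):
    """Return the permutation p applied twice (p given as a tuple of images)."""
    return tuple(p[p[i]] for i in range(len(p)))

def fixed_points(p):
    return sum(1 for i, j in enumerate(p) if i == j)

def two_cycles(p):
    """Number of 2-cycles of p."""
    return sum(1 for i, j in enumerate(p) if j > i and p[j] == i)

def standard_character(p):
    """Character of {n-1,1}: number of fixed points minus one."""
    return fixed_points(p) - 1

def symmetric_square(chi, p):
    """Spur of the second induced matrix."""
    return (chi(p) ** 2 + chi(compose_square(p))) // 2

def skew_square(chi, p):
    """Spur of the second compound matrix."""
    return (chi(p) ** 2 - chi(compose_square(p))) // 2

n = 5
for p in permutations(range(n)):
    f, c2 = fixed_points(p), two_cycles(p)
    trivial = 1                        # {n}
    standard = f - 1                   # {n-1,1}
    two_row = comb(f, 2) + c2 - f      # {n-2,2}
    hook = comb(f, 2) - c2 - f + 1     # {n-2,1^2}
    sym = symmetric_square(standard_character, p)
    skew = skew_square(standard_character, p)
    assert sym + skew == standard ** 2          # direct product = induced + compound
    assert sym == trivial + standard + two_row
    assert skew == hook
```

For $n = 5$ every assertion passes, so on this example the symmetric part is $\{n\} + \{n-1,1\} + \{n-2,2\}$ and the skew part is $\{n-2,1^2\}$.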

The expression $\{\lambda\}\cdot\{\lambda\}$ may be analysed correspondingly into two parts
which will be denoted respectively by $\{\lambda\} \odot \{2\}$ and $\{\lambda\} \odot \{1^2\}$. The operation
which is denoted by the symbol $\odot$ is called inner plethysm, and in the general
case is defined as follows.


Received February 8, 1957. 













Let $M_\rho$ be a matrix representation of the symmetric group on $n$ symbols
corresponding to the group element $S_\rho$ and let the spur of $M_\rho$ be $\chi^{(\lambda)}(S_\rho)$.
Then the invariant matrix of $M_\rho$ corresponding to the partition $(\mu)$, namely,
$M_\rho^{\{\mu\}}$, is also a matrix representation of the symmetric group, and its spur is
a compound character, say

$\sum G_{\lambda\mu\nu}\,\chi^{(\nu)}(S_\rho).$

Definition.

$\{\lambda\} \odot \{\mu\} = \sum G_{\lambda\mu\nu}\{\nu\}.$

This operation is not commutative. In fact the partitions $(\lambda)$ and $(\nu)$ must
be partitions of the same integer $n$, while the partition $(\mu)$ is not so restricted.

The laws governing the combination of this operation with other operations
concerning S-functions are in many respects similar to those relating to the
ordinary plethysm of S-functions, $\{\lambda\} \otimes \{\mu\}$, except that inner product takes
the place of ordinary product.

Ordinary plethysm satisfies the following two laws (1, 240)

$\{\lambda\} \otimes (A + B) = \{\lambda\} \otimes A + \{\lambda\} \otimes B,$

$\{\lambda\} \otimes (AB) = (\{\lambda\} \otimes A)(\{\lambda\} \otimes B).$

If $M$ is an invariant matrix corresponding to $\{\lambda\}$ of a matrix $N$, then the
invariant matrix of $M$ corresponding to $\{\mu\} + \{\nu\}$ is by definition equivalent
to the direct sum of invariant matrices corresponding to $\{\mu\}$ and $\{\nu\}$
respectively. Expressing the result in terms of S-functions of the latent roots of
$N$ this gives

$\{\lambda\} \otimes (\{\mu\} + \{\nu\}) = \{\lambda\} \otimes \{\mu\} + \{\lambda\} \otimes \{\nu\}.$

If now $M$ is taken as a representation of the symmetric group corresponding
to the partition $(\lambda)$, the result becomes

$\{\lambda\} \odot (\{\mu\} + \{\nu\}) = \{\lambda\} \odot \{\mu\} + \{\lambda\} \odot \{\nu\}.$

Similarly, considering an invariant matrix of $M$ which is the direct product
of invariant matrices corresponding to $\{\mu\}$, $\{\nu\}$ respectively, the result follows,
in the first case

$\{\lambda\} \otimes (\{\mu\}\{\nu\}) = (\{\lambda\} \otimes \{\mu\})(\{\lambda\} \otimes \{\nu\}),$

and in the second case, when $M$ is a representation of a symmetric group, the
product on the right becoming an inner product,

$\{\lambda\} \odot (\{\mu\}\{\nu\}) = (\{\lambda\} \odot \{\mu\})\cdot(\{\lambda\} \odot \{\nu\}).$

Obviously, the simple S-functions $\{\mu\}$, $\{\nu\}$ may be replaced by linear
combinations of S-functions, and thus

1.5    $\{\lambda\} \odot (A + B) = \{\lambda\} \odot A + \{\lambda\} \odot B,$

1.6    $\{\lambda\} \odot (AB) = (\{\lambda\} \odot A)\cdot(\{\lambda\} \odot B).$










The same procedure is available to find the analogue of the equation (1, 290)

$(A + B) \otimes \{\lambda\} = \sum \Gamma_{\mu\nu\lambda}(A \otimes \{\mu\})(B \otimes \{\nu\}).$

With the nomenclature considered above, this equation is obtained by taking
a direct sum of two matrices corresponding to $A$, $B$ respectively. But if
these are representations of the symmetric group, the result is

1.7    $(A + B) \odot \{\lambda\} = \sum \Gamma_{\mu\nu\lambda}(A \odot \{\mu\})\cdot(B \odot \{\nu\}).$

For a product on the left the known result is

$(AB) \otimes \{\lambda\} = \sum g_{\mu\nu\lambda}(A \otimes \{\mu\})(B \otimes \{\nu\})$

where

$\chi^{(\mu)}\chi^{(\nu)} = \sum g_{\mu\nu\lambda}\,\chi^{(\lambda)}$

which is equivalent to

$\{\mu\}\cdot\{\nu\} = \sum g_{\mu\nu\lambda}\{\lambda\}.$

To prove this the matrix $M$ is considered as a direct product of two matrices.
But if it is the direct product of two representations of the symmetric group,
the same reasoning gives

1.8    $(AB) \odot \{\lambda\} = \sum g_{\mu\nu\lambda}(A \odot \{\mu\})\cdot(B \odot \{\nu\}).$

One result is rather remarkable. Ordinary plethysm is associative,

$(A \otimes B) \otimes C = A \otimes (B \otimes C).$

If now the basic matrix is taken as a representation of the symmetric group,
the two plethysms on the left become inner plethysms. But on the right only
the first of the two plethysm signs is changed to inner plethysm. Hence

1.9    $(A \odot B) \odot C = A \odot (B \otimes C).$

In the place of the associative law, the second inner plethysm sign is converted
to ordinary plethysm.

The evaluation of an expansion of an inner plethysm is not at all easy.
A method will be given here, however, for obtaining the expansion of
$\{n-1,1\} \odot \{\lambda\}$. For this purpose formulae will be given for evaluating
$\{n-1,1\} \odot \{r\}$ and $\{n-1,1\} \odot \{1^r\}$. From either of these results by the
aid of inner products the general expansion $\{n-1,1\} \odot \{\lambda\}$ is obtainable.

Some progress may be made by the use of equation 1.9 to evaluate the
more general expression $\{\lambda\} \odot \{\mu\}$. However, a much more powerful procedure
is obtained by a method of expanding $(\{\lambda\}\{\mu\}) \odot \{\nu\}$.


2. Invariant matrices of permutation matrices. The symmetric group
on $n$ symbols has a representation consisting of permutation matrices of
degree $n$. The spur of the representation is $\chi^{(n)} + \chi^{(n-1,1)}$, and it is convenient
to associate it with the expression

$\{n-1\}\{1\} = \{n\} + \{n-1,1\}.$













The spur of the rth induced matrix of these permutation matrices is
a sum of symmetric group characters which is associated with the expression
$(\{n-1\}\{1\}) \odot \{r\}$.

The rth induced matrix is obtained by taking any set of r rows with possible
repetitions, and a set of r columns, again with possible repetitions, and taking
the permanent of the r-rowed square matrix so obtained, allowing a numerical
factor $1/i!$ for an i-fold repeated column. The matrix obtained from such
permanents for all choices of r rows and of r columns is the rth induced matrix.

Since each row of a permutation matrix has only one non-zero element, that
element being unity, for any choice of r rows there is a unique set of r columns
which gives a non-zero permanent, and the permanent in this case is unity.
Thus the induced matrix is also a permutation matrix. Further, if the set of
r rows has repetitions the set of r columns must have exactly corresponding
repetitions.
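Since the induced matrix is itself a permutation matrix on the sets of r rows with repetitions, its spur is the number of such sets left fixed. The following sketch is an illustration added here, not the paper's procedure: it counts the fixed multisets directly and compares the count with the coefficient of $t^r$ in $\prod 1/(1 - t^{l})$ taken over the cycle lengths $l$, which is the generating-function form of the same count. The helper names are assumptions made for the example.

```python
from itertools import combinations_with_replacement, permutations

def induced_trace(p, r):
    """Spur of the r-th induced matrix of the permutation matrix of p,
    computed as the number of size-r multisets of symbols fixed by p."""
    count = 0
    for rows in combinations_with_replacement(range(len(p)), r):
        image = tuple(sorted(p[i] for i in rows))
        if image == rows:
            count += 1
    return count

def cycle_lengths(p):
    """Cycle lengths of the permutation p (a tuple of images)."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i]
                length += 1
            lengths.append(length)
    return lengths

def series_coefficient(lengths, r):
    """Coefficient of t^r in the product of 1/(1 - t^l) over the cycle lengths l."""
    coeffs = [1] + [0] * r
    for l in lengths:
        for k in range(l, r + 1):
            coeffs[k] += coeffs[k - l]
    return coeffs[r]

for p in permutations(range(4)):
    for r in range(5):
        assert induced_trace(p, r) == series_coefficient(cycle_lengths(p), r)
```

A fixed multiset is a union of whole cycles taken with multiplicities, which is why the two counts agree.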

Let the row of the induced matrix correspond to

$a_1^{\alpha_1} a_2^{\alpha_2} \cdots a_i^{\alpha_i}, \qquad \alpha_1 + \alpha_2 + \cdots + \alpha_i = r,$

this indicating that the first row is repeated $\alpha_1$ times, the second $\alpha_2$ times, and
so on. This term will be permuted by the symmetric group permutations into
every expression which appears in the monomial symmetric function

$\sum a_1^{\alpha_1} a_2^{\alpha_2} \cdots a_i^{\alpha_i}.$

The induced matrix can thus be analysed into a direct sum of permutation
matrices each corresponding to a monomial symmetric function. These will
be considered individually.

Consider first $\sum a_1^{r}$. The permutation matrix is simply the matrix of
permutation of the $a_i$'s and thus corresponds to

$\{n-1\}\{1\}.$

Consider next $\sum a_1^{r-1}a_2$. The expression $a_1^{r-1}a_2$ is unaltered by all
permutations on the other $n-2$ symbols but is changed by any permutation
which involves either $a_1$ or $a_2$. The corresponding expression is

$\{n-2\}\{1\}\{1\}.$

Assuming $r > 4$, $\sum a_1^{r-2}a_2^{2}$ yields the same result.

Consider next $\sum a_1^{r-2}a_2a_3$. The term $a_1^{r-2}a_2a_3$ is unaltered by any
permutation of the other $n-3$ symbols and also by the interchange of $a_2$ and $a_3$.
The corresponding expression is thus

$\{n-3\}\{2\}\{1\}.$

The result in the general case may now be inferred. Consider the term

$a_1^{\lambda_1} a_2^{\lambda_2} \cdots a_i^{\lambda_i}$

where $(\lambda_1, \lambda_2, \ldots, \lambda_i)$ is any partition of r. Let the conjugate partition be
$(\mu_1, \mu_2, \ldots, \mu_j)$, so that $\mu_1 = i$.





There are $n - i = n - \mu_1$ symbols which do not enter into the expression
$a_1^{\lambda_1} a_2^{\lambda_2} \cdots a_i^{\lambda_i}$ and the term is unaltered by the symmetric group of
permutations on these symbols. There are $\mu_1 - \mu_2$ indices each equal to 1, and
again the symmetric group of permutations on these symbols is allowable.
There is another set of $\mu_2 - \mu_3$ equal indices, and so on. The corresponding
expression is

$\{n-\mu_1\}\{\mu_1-\mu_2\}\{\mu_2-\mu_3\} \cdots \{\mu_{j-1}-\mu_j\}\{\mu_j\}.$

Hence

THEOREM I.

$(\{n-1\}\{1\}) \odot \{r\} = \sum \{n-\mu_1\}\{\mu_1-\mu_2\} \cdots \{\mu_{j-1}-\mu_j\}\{\mu_j\}$

summed for all partitions $(\mu_1, \mu_2, \ldots, \mu_j)$ of r.

The following examples illustrate. Since

$h_2 = \sum a_1^2 + \sum a_1a_2,$

then

$(\{n-1\}\{1\}) \odot \{2\} = \{n-1\}\{1\} + \{n-2\}\{2\} = 2\{n\} + 2\{n-1,1\} + \{n-2,2\}.$

If $(\lambda) = (\lambda_1, \lambda_2, \ldots, \lambda_k)$ is a partition of s it is convenient to denote
$\{n-s, \lambda_1, \lambda_2, \ldots, \lambda_k\}$ by $[\lambda_1, \lambda_2, \ldots, \lambda_k] = [\lambda]$. The above result may then
be written

$([0] + [1]) \odot \{2\} = 2[0] + 2[1] + [2].$

Again, corresponding to

$h_3 = \sum a_1^3 + \sum a_1^2a_2 + \sum a_1a_2a_3,$

$([0] + [1]) \odot \{3\} = \{n-1\}\{1\} + \{n-2\}\{1\}\{1\} + \{n-3\}\{3\} = 3[0] + 4[1] + 2[2] + [1^2] + [3].$

Similarly

$([0] + [1]) \odot \{4\} = 5[0] + 7[1] + 5[2] + 2[1^2] + 2[3] + [21] + [4].$

Using equation 1.7,

$([0] + [1]) \odot \{r\} = [1] \odot (\{r\} + \{r-1\} + \cdots + \{1\} + \{0\}).$

Hence

$[1] \odot \{r\} = ([0] + [1]) \odot (\{r\} - \{r-1\}).$

Thus

$[1] \odot \{0\} = [0],$
$[1] \odot \{1\} = [1],$
$[1] \odot \{2\} = [0] + [1] + [2],$
$[1] \odot \{3\} = [0] + 2[1] + [2] + [1^2] + [3],$
$[1] \odot \{4\} = 2[0] + 3[1] + 3[2] + [1^2] + [3] + [21] + [4].$
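The bookkeeping of Theorem I can be sketched in code. The example below is an added illustration under stated assumptions: `theorem_one_terms` (a hypothetical name) lists, for each partition $(\mu)$ of r, the sizes $n-\mu_1, \mu_1-\mu_2, \ldots, \mu_j$ of the factors in the product, and `degree` gives the degree of the corresponding permutation representation as a multinomial coefficient. As a consistency check, the degrees should add up to the number of monomials of degree r in n symbols, which is the degree of the rth induced matrix.

```python
from math import comb, factorial

def partitions(r, largest=None):
    """Yield the partitions of r (r >= 1) as non-increasing tuples."""
    if largest is None:
        largest = r
    if r == 0:
        yield ()
        return
    for first in range(min(r, largest), 0, -1):
        for rest in partitions(r - first, first):
            yield (first,) + rest

def theorem_one_terms(n, r):
    """For each partition (mu) of r, the factor sizes
    (n - mu_1, mu_1 - mu_2, ..., mu_{j-1} - mu_j, mu_j) appearing in Theorem I."""
    terms = []
    for mu in partitions(r):
        if mu[0] > n:
            continue  # the monomial would need more symbols than are available
        sizes = [n - mu[0]]
        sizes += [mu[k] - mu[k + 1] for k in range(len(mu) - 1)]
        sizes.append(mu[-1])
        terms.append(tuple(sizes))
    return terms

def degree(sizes):
    """Degree of the permutation representation {s_1}{s_2}...: a multinomial coefficient."""
    result = factorial(sum(sizes))
    for s in sizes:
        result //= factorial(s)
    return result

n, r = 5, 3
terms = theorem_one_terms(n, r)
# (2, 3) is {n-3}{3}, (3, 1, 1) is {n-2}{1}{1}, (4, 0, 0, 1) is {n-1}{1}
assert sum(degree(t) for t in terms) == comb(n + r - 1, r)
```

Factors of size 0 stand for $\{0\} = 1$ and may be dropped when the product is written out, as in the examples for $r = 2$ and $r = 3$ above.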
















The compound matrices are in some respects simpler. Consider the rth
compound matrix of the permutation matrix which permutes the symbols
$a_1, a_2, \ldots, a_n$.

Any set of r distinct rows may be chosen. This gives, for a non-zero
determinant, a unique set of r columns. The determinant of the minor is $\pm 1$. If it
were always $+1$ the result would be a permutation matrix, but this
permutation matrix must be modified to allow for possible factors of the form
$-1$.

The element in the compound matrix will be unaltered by any permutation
of the remaining $n - r$ symbols, but for a permutation of the r symbols there
will be a change in sign if the permutation is negative. Thus there is involved
the symmetric group on a set of $n - r$ symbols and the negative symmetric
group on the set of r symbols. The corresponding expression is

$(\{n-1\}\{1\}) \odot \{1^r\} = \{n-r\}\{1^r\} = \{n-r+1, 1^{r-1}\} + \{n-r, 1^r\}.$

Again from equation 1.7,

$([0] + [1]) \odot \{1^r\} = [1] \odot (\{1^r\} + \{1^{r-1}\}).$

Hence

THEOREM II.

$\{n-1,1\} \odot \{1^r\} = \{n-r, 1^r\}.$
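Theorem II asserts in particular that the rth compound of $\{n-1,1\}$ is a single simple character. A hedged numerical check, added here and not part of the paper, is to compute the spur of the compound matrices of a permutation matrix as the coefficients of $\prod (1 - (-t)^{l})$ over the cycle lengths $l$ (the expansion of $\det(1 + tP)$), peel off the identical representation using the relation displayed just before the theorem, and verify that the resulting class function has norm 1. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def cycle_lengths(p):
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i]
                length += 1
            lengths.append(length)
    return lengths

def compound_perm_characters(p, rmax):
    """Spurs of the 0th..rmax-th compound matrices of the permutation matrix of p:
    coefficients of t^0..t^rmax in the product over cycles of (1 - (-t)^l)."""
    coeffs = [1] + [0] * rmax
    for l in cycle_lengths(p):
        sign = -((-1) ** l)  # 1 - (-t)^l = 1 + sign * t^l
        for k in range(rmax, l - 1, -1):
            coeffs[k] += sign * coeffs[k - l]
    return coeffs

def compound_standard_character(p, r):
    """Spur of {n-1,1} (.) {1^r}, obtained from
    ([0]+[1]) (.) {1^r} = [1] (.) ({1^r} + {1^(r-1)}) by alternating subtraction."""
    perm = compound_perm_characters(p, r)
    return sum((-1) ** k * perm[r - k] for k in range(r + 1))

def norm(n, r):
    """Mean of the squared character over the symmetric group on n symbols."""
    group = list(permutations(range(n)))
    total = sum(compound_standard_character(p, r) ** 2 for p in group)
    return Fraction(total, len(group))

n = 5
for r in range(n):
    assert norm(n, r) == 1  # a simple character, consistent with Theorem II
    assert compound_standard_character(tuple(range(n)), r) == comb(n - 1, r)
```

The check confirms irreducibility and the degree $\binom{n-1}{r}$; it does not by itself identify the partition $(n-r, 1^r)$.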


To illustrate, the expression for $[1] \odot \{3\}$ will be obtained by the use of
Theorem II.

Since

$\{3\} = \{1\}^3 + \{1^3\} - 2\{1\}\{1^2\},$

therefore

$[1] \odot \{3\} = [1] \odot (\{1\}^3 + \{1^3\} - 2\{1\}\{1^2\})$
$= [1]\cdot[1]\cdot[1] + [1^3] - 2[1]\cdot[1^2].$

Since

$[1]\cdot[1] = [0] + [1] + [2] + [1^2],$

therefore

$[1] \odot \{3\} = [1^3] + [1]\cdot([0] + [1] + [2] - [1^2])$
$= [1^3] + [0] + 2[1] + [2] + [1^2] + [3] - [1^3]$
$= [0] + 2[1] + [2] + [1^2] + [3],$

which conforms with the result obtained by the use of Theorem I.

By evaluating inner products in this way the general expansion $\{n-1,1\} \odot \{\lambda\}$
may be evaluated either from Theorem I or from Theorem II.

A generalization of the method to evaluate $\{\lambda\} \odot \{\mu\}$ in the general case
is not apparent. However, some progress may be made by the use of equation
1.9.




Thus, since

$[1] \odot \{1^2\} = [1^2]$

therefore

$[1^2] \odot \{1^2\} = ([1] \odot \{1^2\}) \odot \{1^2\} = [1] \odot (\{1^2\} \otimes \{1^2\})$
$= [1] \odot \{21^2\} = [1] \odot (\{1\}\{1^3\} - \{1^4\})$
$= [1]\cdot[1^3] - [1^4]$
$= [1^2] + [21] + [1^3] + [21^2].$

Evaluation of $[1^2] \odot \{\lambda\}$ may be performed by this method. The evaluation
of $[2] \odot \{\lambda\}$ is a little more complicated.

Thus, since

$[1] \odot \{2\} = [0] + [1] + [2]$

therefore

$([0] + [1] + [2]) \odot \{1^2\} = [1] \odot (\{2\} \otimes \{1^2\})$
$= [1] \odot \{31\},$

which may be evaluated as described above to give

$2[1] + 2[2] + 3[1^2] + [3] + 2[21] + [1^3] + [31].$

But also

$([0] + [1] + [2]) \odot \{1^2\} = ([0] + [1]) \odot \{1^2\} + ([0] + [1])\cdot[2] + [2] \odot \{1^2\}$
$= 2[1] + 2[2] + 2[1^2] + [3] + [21] + [2] \odot \{1^2\}.$

Hence

$[2] \odot \{1^2\} = [1^2] + [21] + [1^3] + [31].$


Theoretically such procedures will allow the evaluation of $\{\lambda\} \odot \{\mu\}$ in
every case, but in practice the calculation may become extremely involved.

The general method, however, is applicable to any permutation
representation, although the only one so far considered has been the permutation of the
actual symbols, which corresponds to the expression $\{n-1\}\{1\}$.

Consider the subgroup corresponding to $\{n-r\}\{r\}$. The corresponding
permutation representation permutes the sets of r symbols such as $a_1, a_2, \ldots, a_r$.
The second induced matrix of the representation will correspond to the
expression

$(\{n-r\}\{r\}) \odot \{2\}.$

Taking a pair of rows from the permutation matrix, these will correspond
to two sets of r symbols $a_1, a_2, \ldots, a_r$ and $a_1', a_2', \ldots, a_r'$. If there are $r-i$
symbols common to the two sets, then also for the corresponding columns
which give a non-zero element the two sets of r symbols will have exactly
$r-i$ symbols in common. The second induced matrix is therefore reducible,
the constituent matrices corresponding to the different values of i. It is
therefore pertinent to consider these individually.













Consider the two sets of symbols

$a_1, a_2, \ldots, a_i, a_{i+1}, \ldots, a_r; \qquad a_1', a_2', \ldots, a_i', a_{i+1}, \ldots, a_r.$

The pair of sets is invariant for any permutation confined to the set $a_1, \ldots, a_i$
or to the set $a_1', \ldots, a_i'$, or to the set $a_{i+1}, \ldots, a_r$. It is also invariant for the
interchange of the first two sets of i symbols. The corresponding expression
is therefore

$(\{i\} \otimes \{2\})\{r-i\}\{n-r-i\}.$

Hence

THEOREM III.

$(\{n-r\}\{r\}) \odot \{2\} = \sum_{i=0}^{r} (\{i\} \otimes \{2\})\{r-i\}\{n-r-i\}.$

The second compound matrix gives similarly

THEOREM IV.

$(\{n-r\}\{r\}) \odot \{1^2\} = \sum_{i=1}^{r} (\{i\} \otimes \{1^2\})\{r-i\}\{n-r-i\}.$

The case $i = 0$ can be omitted since $\{0\} \otimes \{1^2\} = 0$.
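A quick degree count gives a sanity check on Theorems III and IV. The sketch below is an added illustration, with hypothetical function names, and checks degrees only, not the full decomposition: the term for a given i has as its degree the number of unordered pairs of r-sets with exactly $r-i$ symbols in common, and these degrees should add up to the degree of the second induced matrix (Theorem III) or of the second compound matrix (Theorem IV) of a permutation representation of degree $N = \binom{n}{r}$.

```python
from math import comb, factorial

def term_degree(n, r, i):
    """Degree of ({i} (x) {2}){r-i}{n-r-i}, which for i >= 1 is also the degree of
    ({i} (x) {1^2}){r-i}{n-r-i}: the number of unordered pairs of r-sets that
    share exactly r - i symbols."""
    if n - r - i < 0:
        return 0  # not enough symbols outside the first set
    ordered = factorial(n) // (
        factorial(i) ** 2 * factorial(r - i) * factorial(n - r - i)
    )
    return ordered if i == 0 else ordered // 2

def theorem_three_degree(n, r):
    return sum(term_degree(n, r, i) for i in range(0, r + 1))

def theorem_four_degree(n, r):
    return sum(term_degree(n, r, i) for i in range(1, r + 1))

for n in range(2, 9):
    for r in range(1, n):
        N = comb(n, r)
        assert theorem_three_degree(n, r) == comb(N + 1, 2)  # second induced matrix
        assert theorem_four_degree(n, r) == comb(N, 2)       # second compound matrix
```

For $n = 6$, $r = 2$ the three terms of Theorem III have degrees 15, 60 and 45, adding to $\binom{16}{2} = 120$.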


3. Extension to products of S-functions. These theorems can be
generalized. There is a matrix representation of this subgroup which is the
direct product of a representation of the symmetric group on the first r symbols
corresponding to the partition $(\lambda)$ of r, and a representation of the symmetric
group on the other $n-r$ symbols corresponding to the partition $(\mu)$ of $n-r$.
The corresponding representation of the symmetric group on $n$ symbols is
constructed by taking the direct sum of $n!/r!\,(n-r)!$ such representations
corresponding to the conjugate subgroups and allowing for the permutation
of these representations as well as the matrix products in the various
representations. The representation¹ so obtained, of degree $f^{(\lambda)} f^{(\mu)}\, n!/r!\,(n-r)!$,
corresponds to the expression

$\{\lambda\}\{\mu\}.$

The second induced matrix of this representation corresponds to

$(\{\lambda\}\{\mu\}) \odot \{2\}.$

It is convenient to consider at the same time the second compound matrix,
corresponding to

$(\{\lambda\}\{\mu\}) \odot \{1^2\}$

and the following Theorem is obtained.


¹ See (6) where this representation is extensively used.










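THEOREM V. If $(\nu)$ is a partition of 2,

$(\{\lambda\}\{\mu\}) \odot \{\nu\} = \sum g_{\xi\eta\zeta\nu}\, \Gamma_{\alpha\theta\lambda} \Gamma_{\beta\phi\mu}\, (\{\alpha\} \odot \{\xi\})(\{\beta\} \odot \{\eta\})((\{\theta\}\cdot\{\phi\}) \otimes \{\zeta\})$
$\quad + \sum \tfrac{1}{2}(\Gamma^2_{\alpha\theta\lambda} \Gamma^2_{\beta\phi\mu} - \Gamma_{\alpha\theta\lambda} \Gamma_{\beta\phi\mu})\, (\{\alpha\}\cdot\{\alpha\})(\{\beta\}\cdot\{\beta\})(\{\theta\}\cdot\{\phi\})(\{\theta\}\cdot\{\phi\})$
$\quad + \sum \Gamma_{\alpha\theta\lambda} \Gamma_{\alpha'\theta'\lambda} \Gamma_{\beta\phi\mu} \Gamma_{\beta'\phi'\mu}\, (\{\alpha\}\cdot\{\alpha'\})(\{\beta\}\cdot\{\beta'\})(\{\theta\}\cdot\{\phi\})(\{\theta'\}\cdot\{\phi'\}).$

In this enunciation $\Gamma_{\alpha\theta\lambda}$ is the coefficient of $\{\lambda\}$ in $\{\alpha\}\{\theta\}$, $g_{\xi\eta\zeta\nu}$ is the
coefficient of $\{\nu\}$ in $\{\xi\}\cdot\{\eta\}\cdot\{\zeta\}$ so that $(\xi)$, $(\eta)$, $(\zeta)$ must be partitions of 2.
Instead of taking $(\lambda)$ and $(\mu)$ as partitions of r and $n-r$ it is rather simpler
to take $(\lambda)$ as a partition of $m$ and $(\mu)$ as a partition of $n$. If $(\theta)$ is a partition
of i, then so is $(\phi)$, and also $(\theta')$ and $(\phi')$ in the last summation. Then $(\alpha)$,
$(\alpha')$ are partitions of $m-i$, and $(\beta)$, $(\beta')$ partitions of $n-i$. In the last two
summations, if $(\alpha)$ and $(\alpha')$, $(\beta)$ and $(\beta')$, $(\theta)$ and $(\theta')$, $(\phi)$ and $(\phi')$ are
interchanged an identical term is obtained, but only one of the two identical terms
is included in the summation. The cases for which $(\alpha) = (\alpha')$, $(\beta) = (\beta')$,
$(\theta) = (\theta')$, $(\phi) = (\phi')$ are excluded from the last summation.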

The value of $(\{\lambda\}\{\mu\}) \odot \{2\}$ or $(\{\lambda\}\{\mu\}) \odot \{1^2\}$ is obtained by summing
the permanents or determinants of the two rowed principal minors of the
representation corresponding to $\{\lambda\}\{\mu\}$. Consider any pair of rows.
Corresponding to $(\lambda)$ there are, associated with these two rows, two sets of $m$ symbols.
Suppose that $m-i$ symbols are common to the two sets. For non-zero
results permutations must be confined to those which permute these $m-i$
symbols, permute the $n-i$ symbols which occur in neither set, permute each
set of i symbols which occur in one set only, or interchange these two sets of
i symbols.

Proceeding from the symmetric group on $m$ symbols to the subgroup which
permutes separately the $m-i$ and the i symbols, the representation
corresponding to $\{\lambda\}$ reduces to various representations corresponding to $\{\alpha\}\{\theta\}$
where $(\alpha)$ is a partition of $m-i$ and $(\theta)$ is a partition of i. Such a
representation occurs with frequency $\Gamma_{\alpha\theta\lambda}$. We are thus led to a term $\sum \Gamma_{\alpha\theta\lambda}\{\alpha\}\{\theta\}$.
Similarly for the second row there is a term $\sum \Gamma_{\alpha'\theta'\lambda}\{\alpha'\}\{\theta'\}$.

To a permutation among the $m-i$ symbols there corresponds the direct
product of the representations corresponding to each of the two rows. This
leads to

$\sum \Gamma_{\alpha\theta\lambda} \Gamma_{\alpha'\theta'\lambda} (\{\alpha\}\cdot\{\alpha'\})\{\theta\}\{\theta'\}.$

Combining this with an equivalent result in relation to the partition $(\mu)$
and remembering that for a set of i symbols, for one row the representation
corresponds to $(\theta)$, being obtained from the representation corresponding to
$(\lambda)$, while for the other row the representation corresponds to some $(\phi)$,
obtained from the representation corresponding to $(\mu)$, it is clear that the final
result is that given in the last summation in the enunciation of Theorem V.

But allowance must be made for the possible interchange of the two sets of i
symbols. This interchange is only possible when $(\alpha) = (\alpha')$, $(\beta) = (\beta')$, $(\theta) = (\theta')$,
$(\phi) = (\phi')$. In this case $\{\alpha\}\cdot\{\alpha\}$ is replaced by $\{\alpha\} \odot \{2\} + \{\alpha\} \odot \{1^2\}$, of which













the first involves a plus, the second a minus sign for the interchange. If $\Gamma_{\alpha\theta\lambda}$
is 0 or 1 in every case, this is sufficient.

Similarly $\{\beta\}\cdot\{\beta\}$ is replaced by $\{\beta\} \odot \{2\} + \{\beta\} \odot \{1^2\}$. But the ordinary
product $(\{\theta\}\cdot\{\phi\})(\{\theta\}\cdot\{\phi\})$ is replaced by $(\{\theta\}\cdot\{\phi\}) \otimes \{2\} + (\{\theta\}\cdot\{\phi\}) \otimes \{1^2\}$.

Of the three alternative signs, for $(\{\lambda\}\{\mu\}) \odot \{2\}$ either all must be positive
or exactly two negative, and for $(\{\lambda\}\{\mu\}) \odot \{1^2\}$ either one or three must be
negative. This accounts for the coefficient $g_{\xi\eta\zeta\nu}$.

Special consideration must be given to the case when $\Gamma_{\alpha\theta\lambda}\Gamma_{\beta\phi\mu} > 1$. If,
for example, $\Gamma_{\alpha\theta\lambda} > 1$ then terms corresponding to $\{\alpha\}\{\theta\}$ occur more than
once. In the detailed analysis these will correspond to different Young Tableaux.
But the interchange of $(\alpha)$ with $(\alpha')$ etc., is only allowable if the same Young
Tableau is concerned in each case. The reduction from products to plethysms
can only occur in $\Gamma_{\alpha\theta\lambda}\Gamma_{\beta\phi\mu}$ cases. There remain $\Gamma^2_{\alpha\theta\lambda}\Gamma^2_{\beta\phi\mu} - \Gamma_{\alpha\theta\lambda}\Gamma_{\beta\phi\mu}$ cases
where in the two rows there is a difference in the corresponding Young
Tableaux. These cases remain as ordinary or inner products, but with only
half the frequency. This completes the proof of the Theorem.


As an example consider $(\{31^2\}\{1\}) \odot \{2\}$. For $\{\alpha\} = \{31^2\}$ the terms are

$(\{31^2\} \odot \{2\})(\{1\} \odot \{2\}) + (\{31^2\} \odot \{1^2\})(\{1\} \odot \{1^2\}).$

The second term is zero. To evaluate the first, note that

$\{31^2\} \odot \{2\} = (\{41\} \odot \{1^2\}) \odot \{2\} = \{41\} \odot (\{1^2\} \otimes \{2\})$
$= \{41\} \odot (\{2^2\} + \{1^4\})$
$= \{31^2\}\cdot\{31^2\} - \{41\}\cdot\{21^3\} + \{1^5\}$
$= \{5\} + \{41\} + 2\{32\} + \{2^21\} + \{1^5\}.$

The term is thus

$\{6\} + 2\{51\} + 3\{42\} + \{41^2\} + 2\{3^2\} + 3\{321\} + \{2^3\} + \{2^21^2\} + \{21^4\} + \{1^6\}.$

For $\{\alpha\} = \{\alpha'\} = \{31\}$ the terms are

$(\{31\} \odot \{2\})\{2\} + (\{31\} \odot \{1^2\})\{1^2\}$
$= (\{4\} + \{31\} + \{2^2\})\{2\} + \{21^2\}\{1^2\}$
$= \{6\} + 2\{51\} + 3\{42\} + \{41^2\} + \{3^2\}$
$\quad + 3\{321\} + 2\{2^3\} + \{31^3\} + \{2^21^2\} + \{21^4\}.$

The case $\{\alpha\} = \{\alpha'\} = \{21^2\}$ gives

$(\{21^2\} \odot \{2\})\{2\} + (\{21^2\} \odot \{1^2\})\{1^2\}$

which yields an identical result.

The only other case is $\{\alpha\} = \{31\}$, $\{\alpha'\} = \{21^2\}$. The rows are not now
interchangeable and the terms are

$(\{31\}\cdot\{21^2\})\{1\}\{1\} = (\{31\} + \{2^2\} + \{21^2\} + \{1^4\})\{1\}\{1\}$
$= \{51\} + 3\{42\} + 3\{41^2\} + 2\{3^2\} + 6\{321\} + 2\{2^3\}$
$\quad + 4\{31^3\} + 4\{2^21^2\} + 3\{21^4\} + \{1^6\}.$







Hence

$(\{31^2\}\{1\}) \odot \{2\} = 3\{6\} + 7\{51\} + 12\{42\} + 6\{41^2\} + 6\{3^2\} + 15\{321\}$
$\quad + 7\{2^3\} + 6\{31^3\} + 7\{2^21^2\} + 6\{21^4\} + 2\{1^6\}.$
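The expansion just obtained can be checked for total degree. This is an added sanity check, not part of the paper's argument: it assumes the hook length formula for the degrees $f^{(\lambda)}$ and compares the degree of the right hand side with $\tfrac12 d(d+1)$, where $d = f^{(31^2)} \cdot 6!/(5!\,1!)$ is the degree of the representation $\{31^2\}\{1\}$. Agreement of degrees is necessary but of course not sufficient for the expansion to be right.

```python
from math import factorial

def conjugate(shape):
    """Conjugate partition of a non-empty partition given as a tuple."""
    return [sum(1 for part in shape if part > c) for c in range(shape[0])]

def degree(shape):
    """Degree of the simple character {shape}, by the hook length formula."""
    conj = conjugate(shape)
    hooks = 1
    for i, part in enumerate(shape):
        for j in range(part):
            hooks *= (part - j) + (conj[j] - i) - 1
    return factorial(sum(shape)) // hooks

# Coefficients of the expansion of ({31^2}{1}) (.) {2} stated in the text.
expansion = {
    (6,): 3, (5, 1): 7, (4, 2): 12, (4, 1, 1): 6, (3, 3): 6, (3, 2, 1): 15,
    (2, 2, 2): 7, (3, 1, 1, 1): 6, (2, 2, 1, 1): 7, (2, 1, 1, 1, 1): 6,
    (1, 1, 1, 1, 1, 1): 2,
}

d = degree((3, 1, 1)) * 6            # degree of {31^2}{1}
total = sum(mult * degree(shape) for shape, mult in expansion.items())
assert total == d * (d + 1) // 2     # degree of the second induced matrix
```

Here $d = 36$ and both sides equal 666.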


If this is expressed as

$(\{31^2\}\{1\}) \odot \{2\} = (\{41^2\} + \{31^3\} + \{321\}) \odot \{2\}$
$= \{41^2\} \odot \{2\} + \{31^3\} \odot \{2\} + \{41^2\}\cdot\{31^3\} + \{41^2\}\cdot\{321\} + \{31^3\}\cdot\{321\} + \{321\} \odot \{2\},$

the evaluation of the first five terms and substitution leads to

$\{321\} \odot \{2\} = \{6\} + 2\{51\} + 3\{42\} + \{41^2\} + \{3^2\} + 3\{321\}$
$\quad + 2\{2^3\} + \{31^3\} + \{2^21^2\} + \{21^4\} + \{1^6\}.$


4. Plethysm and inner plethysm. In the case when $(\lambda) = (\mu)$, the left
hand side of the equation of Theorem V becomes

$(\{\lambda\}\{\lambda\}) \odot \{\nu\} = (\{\lambda\} \otimes \{2\} + \{\lambda\} \otimes \{1^2\}) \odot \{\nu\}$
$= (\{\lambda\} \otimes \{2\}) \odot \{\nu\} + (\{\lambda\} \otimes \{1^2\}) \odot \{\nu\} + (\{\lambda\} \otimes \{2\})\cdot(\{\lambda\} \otimes \{1^2\}).$

By a careful analysis of the terms which appear on the right of the equation
it is possible to separate these so as to give expansions for the expressions
$(\{\lambda\} \otimes \{\mu\}) \odot \{\nu\}$ when $(\mu)$, $(\nu)$ are partitions of 2.

It is more convenient to start with the expansion

4.1    $(\{\lambda\}\{\lambda\})\cdot(\{\lambda\}\{\lambda\}) = \sum \Gamma_{\alpha\theta\lambda} \Gamma_{\alpha'\theta'\lambda} \Gamma_{\beta\phi\lambda} \Gamma_{\beta'\phi'\lambda}\, (\{\alpha\}\cdot\{\alpha'\})(\{\beta\}\cdot\{\beta'\})(\{\theta\}\cdot\{\phi\})(\{\theta'\}\cdot\{\phi'\}).$

Consider the set of 8 symbols

4.2    $\alpha, \theta, \phi, \beta, \beta', \phi', \theta', \alpha'.$

Let $T$ be an operation which reverses the order of these 8 symbols, and $S$ an
operation which permutes them cyclically, moving them two steps at a time, so
that, for example, $\alpha \to \phi \to \beta'$ etc. Then

$S^4 = T^2 = I, \qquad ST = TS^3.$

The operations $S$, $T$ generate a group $G$ of order 8, and each term on the right
of 4.1 is converted into an equal term by the operations of this group.
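The stated relations and the invariance of the general term can be verified mechanically. The sketch below is an added illustration under explicit assumptions: the 8 symbols of 4.2 are placed at positions 0 to 7 in the order listed, $T$ is the reversal, $S$ is the shift by two positions, and products are composed as functions (the right-hand factor acts first). With these conventions, which are a choice made for the example, the group generated has order 8 and maps both the set of inner-product factors and the set of coefficient pairs of 4.1 onto themselves.

```python
IDENTITY = tuple(range(8))
# Positions 0..7 hold: alpha, theta, phi, beta, beta', phi', theta', alpha'.
T = tuple(7 - p for p in range(8))        # reverses the order of the 8 symbols
S = tuple((p + 2) % 8 for p in range(8))  # moves every symbol two steps

def compose(a, b):
    """The operation 'b first, then a' on positions."""
    return tuple(a[b[p]] for p in range(8))

def power(a, k):
    result = IDENTITY
    for _ in range(k):
        result = compose(a, result)
    return result

def generate(generators):
    """Closure of the generators under composition."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = compose(h, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

def image_of_pairs(g, pairs):
    return {frozenset(g[p] for p in pair) for pair in pairs}

# Inner-product factors of 4.1: alpha.alpha', beta.beta', theta.phi, theta'.phi'
FACTORS = {frozenset(pair) for pair in [(0, 7), (3, 4), (1, 2), (6, 5)]}
# Coefficient pairs of 4.1: (alpha,theta), (alpha',theta'), (beta,phi), (beta',phi')
COEFFICIENTS = {frozenset(pair) for pair in [(0, 1), (7, 6), (3, 2), (4, 5)]}

G = generate([S, T])
assert power(S, 4) == IDENTITY and compose(T, T) == IDENTITY
assert compose(S, T) == compose(T, power(S, 3))
assert len(G) == 8
assert all(image_of_pairs(g, FACTORS) == FACTORS for g in G)
assert all(image_of_pairs(g, COEFFICIENTS) == COEFFICIENTS for g in G)
```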

In general the terms on the right of 4.1 are equal in sets of 8 according to
the operations of this group. But it may happen that, by reason of certain
equalities among the 8 symbols, certain operations of $G$ convert the term into
the same term. If the operations which leave the term invariant form a
subgroup $\Gamma$ of $G$ of order $\gamma$, the term is repeated only $8/\gamma$ times. When the term
is repeated 8 times just one of the 8 terms is included in $(\{\lambda\} \otimes \{\mu\}) \odot \{\nu\}$
for any partitions $(\mu)$, $(\nu)$ of 2. But when $\gamma > 1$, the term must be further
reduced by converting products into plethysms or inner products into inner
plethysms.

The result will be enunciated as a Theorem. The full result is rather
complicated owing to the need to provide for all the exceptional cases that can
arise, in order to give complete generality. It is thought worth while to give
this result in its full generality since it is a basic result concerning the
interaction of operations with S-functions. To save space, however, proofs will
be omitted. Because of the multiplicity of special cases such proof becomes
intricate and involved. The principle involved is straightforward, representing
only the operation of the appropriate symmetrizing operator on the general
term.


THEOREM VI. If $(\mu)$, $(\nu)$ are partitions of 2, the expansion of $(\{\lambda\} \otimes \{\mu\}) \odot \{\nu\}$
is obtained from the right hand side of equation 4.1 by selecting one from each set
of $8/\gamma$ equal terms which appear, and operating on this term with the appropriate
symmetrizing operator.


The appropriate symmetrizing operators will now be described, and the
effects on the term listed for all possible subgroups.

For simplicity, suppose first that $\Gamma_{\alpha\theta\lambda}$ and all similar coefficients are either
0 or 1.

(1) If $\Gamma = I$, the term is taken unchanged.

(2) If $\Gamma = I, T$ the 8 symbols may be taken as

$\alpha, \theta, \phi, \beta, \beta, \phi, \theta, \alpha.$

The term is replaced by

4.3    $\sum (\{\alpha\} \odot \{\xi\})(\{\beta\} \odot \{\eta\})[(\{\theta\}\cdot\{\phi\}) \otimes (\{\xi\}\cdot\{\eta\}\cdot\{\nu\})],$

summed for all partitions $(\xi)$, $(\eta)$ of 2.

(3) If $\Gamma = I, S^2$ the 8 symbols can be taken as

$\alpha, \theta, \phi, \beta, \alpha, \theta, \phi, \beta.$

The term is replaced by

4.4    $\sum [(\{\alpha\}\cdot\{\beta\}) \otimes \{\xi\}][(\{\theta\}\cdot\{\phi\}) \otimes \{\xi\}].$

The summation in this and other cases is for $(\xi) = (2), (1^2)$ and when it
appears, for $(\eta) = (2), (1^2)$.

(4) If $\Gamma = I, TS$ the symbols can be expressed as

$\alpha, \theta, \phi, \phi, \theta, \alpha, \alpha', \alpha'.$

The term is replaced by

4.5    $\sum [(\{\alpha\}\cdot\{\alpha'\}) \otimes \{\xi\}][(\{\theta\}\cdot\{\phi\}) \otimes (\{\xi\}\cdot\{\mu\})].$

(5) If $\Gamma = I, S, S^2, S^3$ the symbols can be expressed as

$\alpha, \theta, \alpha, \theta, \alpha, \theta, \alpha, \theta.$










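The term is replaced by

4.6    $\sum [(\{\alpha\}\cdot\{\theta\}) \otimes \{\xi\}] \otimes (\{\mu\}\cdot\{\nu\}).$

(6) If $\Gamma = I, S^2, T, TS^2$ the symbols can be expressed as

$\alpha, \theta, \theta, \alpha, \alpha, \theta, \theta, \alpha.$

The term is replaced by

4.7    $[(\{\alpha\} \odot \{2\}) \otimes \{\nu\} + (\{\alpha\} \odot \{1^2\}) \otimes \{\nu\}][(\{\theta\} \odot \{2\}) \otimes \{\nu\} + (\{\theta\} \odot \{1^2\}) \otimes \{\nu\}]$
$\quad + [(\{\alpha\} \odot \{2\}) \otimes (\{\nu\}\cdot\{1^2\}) + (\{\alpha\} \odot \{1^2\}) \otimes (\{\nu\}\cdot\{1^2\})](\{\theta\} \odot \{2\})(\{\theta\} \odot \{1^2\})$
$\quad + [(\{\theta\} \odot \{2\}) \otimes (\{\nu\}\cdot\{1^2\}) + (\{\theta\} \odot \{1^2\}) \otimes (\{\nu\}\cdot\{1^2\})](\{\alpha\} \odot \{2\})(\{\alpha\} \odot \{1^2\})$
$\quad + (\{\alpha\} \odot \{2\})(\{\alpha\} \odot \{1^2\})(\{\theta\} \odot \{2\})(\{\theta\} \odot \{1^2\}).$

(7) If $\Gamma = I, S^2, TS, ST$ the symbols can be expressed as

$\alpha, \alpha, \beta, \beta, \alpha, \alpha, \beta, \beta.$

The term is replaced by

4.8    $\sum (\{\alpha\}\cdot\{\beta\}) \otimes \{\xi\} \otimes \{\mu\}.$

(8) Lastly when $\Gamma = G$ so that all 8 symbols represent the same partition
$(\alpha)$, the term is replaced by

4.9    $(\{\alpha\} \odot \{2\}) \otimes \{\mu\} \otimes \{\nu\} + (\{\alpha\} \odot \{1^2\}) \otimes \{\mu\} \otimes \{\nu\}$
$\quad + [(\{\alpha\} \odot \{1^2\}) \otimes \{\mu\}][(\{\alpha\} \odot \{2\}) \otimes \{\mu\}] + [(\{\alpha\} \odot \{2\})(\{\alpha\} \odot \{1^2\})] \otimes \{\nu\}$
$\quad + [(\{\alpha\} \odot \{2\}) \otimes \{\bar\mu\} + (\{\alpha\} \odot \{1^2\}) \otimes \{\bar\mu\}](\{\alpha\} \odot \{2\})(\{\alpha\} \odot \{1^2\}),$

where $\{\bar\mu\} = \{\mu\}\cdot\{1^2\}$.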
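The patterns of equal symbols listed in cases (2) to (8) are the orbits of the respective subgroups on the 8 positions of 4.2. The sketch below, added as an illustration, recomputes them under the same assumed conventions as before (positions 0 to 7 in the order of 4.2, $T$ the reversal, $S$ the shift by two, products composed as functions with the right-hand factor acting first). Letters are assigned to orbits in order of first appearance, so `a b c d` stands for four unrelated partitions rather than for specific Greek letters.

```python
from string import ascii_lowercase

IDENTITY = tuple(range(8))
T = tuple(7 - p for p in range(8))        # reversal of the 8 symbols
S = tuple((p + 2) % 8 for p in range(8))  # cyclic move by two steps

def compose(a, b):
    """The operation 'b first, then a' on positions."""
    return tuple(a[b[p]] for p in range(8))

S2 = compose(S, S)
S3 = compose(S, S2)
TS, ST, TS2 = compose(T, S), compose(S, T), compose(T, S2)

def pattern(subgroup):
    """Label the 8 positions so that positions in one orbit share a letter."""
    labels, out = {}, []
    for p in range(8):
        orbit = frozenset(g[p] for g in subgroup)
        if orbit not in labels:
            labels[orbit] = ascii_lowercase[len(labels)]
        out.append(labels[orbit])
    return " ".join(out)

cases = {
    2: [IDENTITY, T],
    3: [IDENTITY, S2],
    4: [IDENTITY, TS],
    5: [IDENTITY, S, S2, S3],
    6: [IDENTITY, S2, T, TS2],
    7: [IDENTITY, S2, TS, ST],
}
for number, subgroup in cases.items():
    print(number, pattern(subgroup))
```

With these conventions the printed patterns are `a b c d d c b a`, `a b c d a b c d`, `a b c c b a d d`, `a b a b a b a b`, `a b b a a b b a` and `a a b b a a b b`, in agreement with the lists of symbols given for cases (2) to (7).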

The cases when $\Gamma_{\alpha\theta\lambda}$ or a similar coefficient is $> 1$ must be considered for
the sake of completeness, although it seems really to be of academic
importance only. The simplest case occurs with

$(\{321\} \otimes \{2\}) \odot \{2\}$

which involves representations of the symmetric group on 12 symbols. In
the application of the theorem several hundreds of different cases arise.
The required changes occur for subgroups $\Gamma$ when the coefficient

$\Gamma_{\alpha\theta\lambda} \Gamma_{\alpha'\theta'\lambda} \Gamma_{\beta\phi\lambda} \Gamma_{\beta'\phi'\lambda}$

on the right hand side of 4.1 is such that some of the factors such as $\Gamma_{\alpha\theta\lambda}$
are equal and interchangeable by the operations of $\Gamma$. Thus if $(\alpha) = (\alpha')$,
$(\theta) = (\theta')$ so that $\Gamma_{\alpha\theta\lambda} = \Gamma_{\alpha'\theta'\lambda}$, then in the modified expression the coefficient
is taken as $\Gamma_{\alpha\theta\lambda}$ rather than $\Gamma^2_{\alpha\theta\lambda}$. The remaining terms are halved in number,

















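that is, the coefficient is taken as $\tfrac{1}{2}(\Gamma^2_{\alpha\theta\lambda} - \Gamma_{\alpha\theta\lambda})$, and treated as if they
belonged to the subgroup which does not interchange $\Gamma_{\alpha\theta\lambda}$ and $\Gamma_{\alpha'\theta'\lambda}$. In the
case when $\Gamma_{\alpha\theta\lambda}$, $\Gamma_{\alpha'\theta'\lambda}$, $\Gamma_{\beta\phi\lambda}$, $\Gamma_{\beta'\phi'\lambda}$ are all identical, say when

$(\alpha) = (\theta) = (\alpha') = (\theta') = (\beta) = (\phi) = (\beta') = (\phi')$

the completely modified form is taken with coefficient $\Gamma_{\alpha\theta\lambda}$. Each of 3
subgroups which only interchange the coefficients in pairs is taken with
coefficient $\tfrac{1}{2}(\Gamma^2_{\alpha\theta\lambda} - \Gamma_{\alpha\theta\lambda})$. Each of two subgroups which make one
interchange of coefficients is taken with coefficient $\tfrac{1}{4}(\Gamma^3_{\alpha\theta\lambda} - \Gamma^2_{\alpha\theta\lambda})$. Finally the
completely unmodified term as in (4.1) is taken with coefficient

$\tfrac{1}{8}(\Gamma^4_{\alpha\theta\lambda} - 2\Gamma^3_{\alpha\theta\lambda} - \Gamma^2_{\alpha\theta\lambda} + 2\Gamma_{\alpha\theta\lambda}).$

The reason for this is that, when $\Gamma_{\alpha\theta\lambda} > 1$ the $\Gamma_{\alpha\theta\lambda}$ distinct terms correspond
to different Young Tableaux, and interchange is only allowable if the same
Young Tableau is involved. For different Young Tableaux, the interchange
not being allowable, one half only of the terms are selected.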

Although the application of the Theorem for large partitions $(\lambda)$ can be
so complicated as to be quite beyond calculation, nevertheless for partitions
$(\lambda)$ of $n$ up to, say, $n = 5$ application can be quite simple. Thus consider
$(\{3\} \otimes \{1^2\}) \odot \{2\}$.

In every case $(\alpha) = (\alpha') = (\beta) = (\beta')$, $(\theta) = (\theta') = (\phi) = (\phi')$. In every
case $\Gamma = I, S^2, T, TS^2$, and there is the added simplification that in each case

$\{\alpha\} \odot \{1^2\} = \{\theta\} \odot \{1^2\} = 0.$

Just two cases arise,

$(\alpha) = (3),\ (\theta) = (0); \qquad (\alpha) = (2),\ (\theta) = (1)$

with $\Gamma_{\alpha\theta\lambda} = 1$ in each case.

Hence

$(\{3\} \otimes \{1^2\}) \odot \{2\} = \{3\} \otimes \{2\} + (\{2\} \otimes \{2\})\{2\}$
$= 2\{6\} + \{51\} + 3\{42\} + \{321\} + \{2^3\}.$
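The last step can be written out in full. The intermediate expansions below are standard plethysms and products of S-functions, supplied here as a worked check and not quoted from the paper:

```latex
\begin{align*}
\{3\} \otimes \{2\} &= \{6\} + \{42\}, \\
\{2\} \otimes \{2\} &= \{4\} + \{2^2\}, \\
\{4\}\{2\} &= \{6\} + \{51\} + \{42\}, \\
\{2^2\}\{2\} &= \{42\} + \{321\} + \{2^3\}, \\
\{3\} \otimes \{2\} + (\{2\} \otimes \{2\})\{2\}
  &= 2\{6\} + \{51\} + 3\{42\} + \{321\} + \{2^3\}.
\end{align*}
```

The degrees also agree: $\{3\} \otimes \{1^2\} = \{51\} + \{3^2\}$ has degree 10, its second induced matrix has degree 55, and $2 \cdot 1 + 5 + 3 \cdot 9 + 16 + 5 = 55$.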


5. Generalizations to Higher Degrees. The above results can be
generalized to give expressions for $(\{\lambda\}\{\mu\}) \odot \{\nu\}$ for $(\nu)$ a partition of $n$,
and for $(\{\lambda\} \otimes \{\mu\}) \odot \{\nu\}$ for $(\mu)$ a partition of $m$, $(\nu)$ a partition of $n$. Even
in the simplest cases the results alone, without any proofs, are highly elaborate.
It may be worth while to indicate briefly the method.

Consider first the case

$(\{m\}\{n\}) \odot \{3\}$

where $m$, $n$ are integers. The third induced matrix is considered of the
permutation matrix corresponding to sets of $m$ symbols $a_1, a_2, \ldots, a_m$ taken from
a set of $m + n$ symbols. For this, sets of 3 rows are considered, possibly with













PLETHYSM OF S-FUNCTIONS 15 


repetitions. Suppose that there are $a_{111}$ symbols which are present for all 3 rows, $a_{112}$ symbols which are present for the first two rows but absent from the third, and so on with finally $a_{222}$ symbols absent from the sets for all 3 rows. Then permutation is allowable on each of the respective sets of $a_{111}$, $a_{112}$, $a_{121}$, $a_{211}$, $a_{122}$, $a_{212}$, $a_{221}$, $a_{222}$ symbols.

This leads to

$$(\{m\}\{n\}) \odot \{3\} = \sum \Big(\prod \{a_{ijk}\}\Big)$$

summed for all solutions of

$$\sum a_{ijk} = m + n, \qquad \sum a_{1jk} = \sum a_{i1k} = \sum a_{ij1} = m.$$


This result must be modified if the set of numbers $a_{ijk}$ is unaltered by a group of permutations of the suffixes. In this case products of equal S-functions are replaced by appropriate plethysms.

The modifications for $(\{m\}\{n\}) \odot \{21\}$ and $(\{m\}\{n\}) \odot \{1^3\}$ present no difficulty.

If $\{m\}\{n\}$ is replaced by $\{\mu\}\{\nu\}$, with $(\mu)$ a partition of $m$, $(\nu)$ a partition of $n$, then $a_{ijk}$ is replaced by a partition $(\alpha_{ijk})$.

In the place of an equation, say

$$a_{111} + a_{112} + a_{121} + a_{122} = m,$$

there is the condition that $\{\mu\}$ appears in the product

$$\{\alpha_{111}\}\{\alpha_{112}\}\{\alpha_{121}\}\{\alpha_{122}\}$$

and $\{\nu\}$ in the product

$$\{\alpha_{211}\}\{\alpha_{212}\}\{\alpha_{221}\}\{\alpha_{222}\},$$

the appropriate coefficients being introduced into the sum.

The generalization of $(\{\lambda\} \otimes \{\mu\}) \odot \{\nu\}$ when $(\mu)$ is a partition of 3 requires an analysis of $(\{\lambda\}\{\lambda\}\{\lambda\}) \odot \{\nu\}$. Repeated application of the result for $(\{\lambda\}\{\mu\}) \odot \{\nu\}$ shows that a similar expansion holds for $(\{\lambda\}\{\lambda'\}\{\lambda''\}) \odot \{\nu\}$. Putting $(\lambda) = (\lambda') = (\lambda'')$ the product $\{\lambda\}\{\lambda\}\{\lambda\}$ is expressed in terms of plethysms and symmetrizing operators give $(\{\lambda\} \otimes \{\mu\}) \odot \{\nu\}$. The procedure is in every way similar when $(\mu)$ is a partition of 4 or more.

In the analysis of the last section, when $(\mu)$, $(\nu)$ were partitions of 2, the group $G$ of order 8 played a leading rôle. This group may be represented as the imprimitive group corresponding to $\{2\} \otimes \{2\}$. When $(\mu)$ is a partition of $m$ and $(\nu)$ a partition of $n$ the group which plays the corresponding rôle is the imprimitive group corresponding to $\{m\} \otimes \{n\}$. The characters and the subgroups of this group have significant applications.*

It may be remarked that while this paper was being written Murnaghan (5) has published a result which is equivalent to Theorem II. He gives no proof.


*See (3) for analysis of characters of this group. 










REFERENCES 


1. D. E. Littlewood, The theory of group characters and matrix representations of groups (2nd ed., Oxford, 1950).
2. D. E. Littlewood, The Kronecker product of symmetric group representations, J. London Math. Soc., 31 (1956), 89-93.
3. D. E. Littlewood, The characters and representations of imprimitive groups, Proc. London Math. Soc., (3) 6 (1956), 251-266.
4. D. E. Littlewood, Plethysm and the inner product of S-functions, J. London Math. Soc., 32 (1957), 18-22.
5. F. D. Murnaghan, On the generation of irreducible representations of the symmetric group, Proc. Nat. Acad. Sci., U.S.A., 41 (1955), 514-515.
6. G. de B. Robinson, On the representations of the symmetric group, III, Amer. J. Math., 70 (1948), 277-294.


University College of North Wales, 
Bangor 



















PRODUCTS AND PLETHYSMS OF CHARACTERS 
WITH ORTHOGONAL, SYMPLECTIC AND 
SYMMETRIC GROUPS 


D. E. LITTLEWOOD 


1. Introduction. Murnaghan (9) has proposed the following method of 
analyzing the Kronecker product of two symmetric group representations. 

If $(\lambda) = (\lambda_1, \lambda_2, \ldots, \lambda_i)$ is a partition of $p$, the representation of the symmetric group on $n$ symbols corresponding to the partition $(n - p, \lambda_1, \ldots, \lambda_i)$ is denoted by $[\lambda]$ and is said to be of depth $p$.

If $[\lambda]$ is of depth $p$ and $[\mu]$ of depth $q$, then the terms in the Kronecker product $[\lambda] \times [\mu]$ of depth $p + q$ are terms which correspond to the terms in the product of S-functions $\{\lambda\}\{\mu\}$. Murnaghan gives a similar formula for the terms of depth $p + q - 1$, $p + q - 2$, $p + q - 3$ and $p - q$. He uses these formulae to work out some of the terms in particular cases and uses various artifices to complete the analysis. But he gives no proof of the formulae and it is by no means clear what is the general result for terms of depth $p + q - r$.

In this paper there will be obtained the equivalent of Murnaghan’s formulae, 
proof of the results, and extension to the general result, so that a complete 
method of analysing the Kronecker product of symmetric group representations 
will be obtained, or equivalently, of expanding the inner product of two 
S-functions (5). 

The method extends to give an analysis of the invariant matrices of 
symmetric group representations, and thus yields the most powerful method 
so far obtained of calculating the inner plethysm of S-functions (6). 

In addition the method can be used with even greater simplicity to calculate 
products and plethysms of characters with orthogonal and symplectic groups. 
These cases, being simpler, will be dealt with first. 


2. Products of orthogonal and symplectic group characters. In this section and in §3, when $(\lambda)$ is a partition of $n$, $[\lambda]$ will denote the character of the orthogonal group which is associated with this partition (4, p. 233). The number of variables will generally be assumed to be large, at least twice the number of parts in any partition. For a smaller number of variables the correct result can always be inferred by using the modification rules (8, p. 282).

It is required to find a formula which expresses a product $[\lambda][\mu]$ in terms of orthogonal group characters.


Received February 8, 1957. 













If $(\lambda)$ is a partition of $n$, let

$$A = A_{i_1 i_2 \ldots i_n}$$

be a tensor of type $[\lambda]$, which implies that under the full linear group it is of type $\{\lambda\}$, but that it is further reduced so that all contractions with the metric tensor $g^{ij}$ are zero.

Similarly, if $(\mu)$ is a partition of $m$, let

$$B = B_{j_1 j_2 \ldots j_m}$$

be a tensor of rank $m$ and type $[\mu]$. Consider the product

$$AB = A_{i_1 \ldots i_n} B_{j_1 \ldots j_m}.$$


Under the full linear group this is of type corresponding to the product $\{\lambda\}\{\mu\}$, but some of the contractions with $g^{ij}$ are zero while others are not. The suffixes $i$, $j$ cannot be contracted with a pair of suffixes of $A$, nor with a pair from $B$, for these would lead to a zero result since the contractions have already been removed. But contraction is still possible if one suffix is contracted with a suffix of $A$ and the other with a suffix of $B$. The contraction gives a non-zero result in the general case since this contraction has not previously been removed.

Let the product $AB$ be contracted with a concomitant of degree $r$ of $g^{ij}$, the $r$ first suffixes being contracted with $A$ and the $r$ second suffixes with $B$. Let the $r$ first suffixes be subject to symmetrizing operators corresponding to the S-function $\{\xi\}$ of weight $r$, and the $r$ second suffixes corresponding to the S-function $\{\eta\}$ of weight $r$. The symmetrizing operator on the $g^{ij}$'s therefore corresponds to $\{\xi\} \cdot \{\eta\}$. Since the $g^{ij}$'s are equal the only possible symmetrizing relation between them is the symmetric one corresponding to $\{r\}$. Hence $\{\xi\} \cdot \{\eta\}$ must contain $\{r\}$, which is only possible if $\{\xi\} = \{\eta\}$.

The contraction of $A$ with a contravariant tensor of type $\{\xi\}$ is of type $\sum \Gamma_{\xi\zeta\lambda}\{\zeta\}$ where $\Gamma_{\xi\zeta\lambda}$ is the coefficient of $\{\lambda\}$ in $\{\xi\}\{\zeta\}$. The contraction of $B$ is of type $\sum \Gamma_{\xi\tau\mu}\{\tau\}$.

Hence the contraction of the product $AB$ is of type

$$\sum \Gamma_{\xi\zeta\lambda}\Gamma_{\xi\tau\mu}\{\zeta\}\{\tau\}.$$

The principal part of this contracted tensor is in general distinct from zero. Hence for each S-function which appears there is a corresponding orthogonal group character in the expansion of $[\lambda][\mu]$. This is true for every suitable S-function $\{\xi\}$.

THEOREM I. If

$$\sum \Gamma_{\xi\zeta\lambda}\Gamma_{\xi\tau\mu}\{\zeta\}\{\tau\} = \sum K_{\lambda\mu\rho}\{\rho\},$$

the summation on the left being with respect to all possible S-functions including $\{\xi\} = \{0\}$, then

$$[\lambda][\mu] = \sum K_{\lambda\mu\rho}[\rho].$$







As an example consider the product $[2][1^2]$. Corresponding to $\{\xi\} = \{0\}$, the product is

$$\{2\}\{1^2\} = \{31\} + \{21^2\}.$$

For $\{\xi\} = \{1\}$, the product is

$$\{1\}\{1\} = \{2\} + \{1^2\}$$

and no other value of $\{\xi\}$ is possible. Thus

$$[2][1^2] = [31] + [21^2] + [2] + [1^2],$$

a result easily checked by other means.
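Theorem I can be tried out numerically. The following Python sketch is not part of the paper. It assumes the standard Jacobi-Trudi and Pieri rules for multiplying S-functions, reads the coefficients $\Gamma$ as the usual coefficients of $\{\lambda\}$ in a product of two S-functions, and uses hypothetical helper names (`s_product`, `skew`, `orthogonal_product`). Partitions are tuples, `()` stands for $\{0\}$, and the number of variables is assumed large enough that no modification rules are needed.

```python
from collections import Counter
from itertools import permutations


def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest


def horizontal_strips(lam, k):
    """Yield partitions nu such that nu/lam is a horizontal strip of size k."""
    rows = list(lam) + [0]

    def extend(i, left, acc):
        if i == len(rows):
            if left == 0:
                yield tuple(x for x in acc if x > 0)
            return
        # Row i may grow at most up to the old length of row i-1.
        room = left if i == 0 else rows[i - 1] - rows[i]
        for add in range(min(left, room) + 1):
            yield from extend(i + 1, left - add, acc + [rows[i] + add])

    yield from extend(0, k, [])


def sign(perm):
    """Sign of a permutation given as a tuple, via its inversion count."""
    inv = sum(1 for i in range(len(perm))
              for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1


def s_product(lam, mu):
    """Expand {lam}{mu}: Jacobi-Trudi determinant for mu, Pieri rule on lam."""
    total = Counter()
    for perm in permutations(range(len(mu))):
        degrees = [mu[i] - i + perm[i] for i in range(len(mu))]
        if min(degrees, default=0) < 0:
            continue  # h_k vanishes for negative k
        terms = Counter({tuple(lam): sign(perm)})
        for k in degrees:
            grown = Counter()
            for shape, coeff in terms.items():
                for nu in horizontal_strips(shape, k):
                    grown[nu] += coeff
            terms = grown
        for shape, coeff in terms.items():
            total[shape] += coeff
    return {shape: c for shape, c in total.items() if c != 0}


def skew(lam, xi):
    """Return {zeta: coefficient of {lam} in {xi}{zeta}}."""
    size = sum(lam) - sum(xi)
    if size < 0:
        return {}
    found = {}
    for zeta in partitions(size):
        c = s_product(xi, zeta).get(tuple(lam), 0)
        if c:
            found[zeta] = c
    return found


def orthogonal_product(lam, mu):
    """Theorem I: coefficients K of the orthogonal characters in [lam][mu]."""
    result = Counter()
    for r in range(min(sum(lam), sum(mu)) + 1):
        for xi in partitions(r):
            for zeta, a in skew(lam, xi).items():
                for tau, b in skew(mu, xi).items():
                    for rho, c in s_product(zeta, tau).items():
                        result[rho] += a * b * c
    return dict(result)


print(orthogonal_product((2,), (1, 1)))
```

Called on `(2,)` and `(1, 1)`, the sketch returns the four characters of the worked example, each with coefficient 1. The enumeration over all $\{\xi\}$ is brute force and only practical for small partitions.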

The method is equally applicable to the symplectic groups. The skew- 
symmetric fundamental form is available for contractions in exactly the 
same way as the symmetric metric. In view of the different significance of 
orthogonal group and symplectic group characters it seems somewhat surpris- 
ing that the multiplication laws for the two sets of characters are identical. 


THEOREM II. Under the same conditions as in Theorem I

$$\langle\lambda\rangle\langle\mu\rangle = \sum K_{\lambda\mu\rho}\langle\rho\rangle.$$

Here $\langle\lambda\rangle$ denotes a symplectic group character.


3. Plethysm with orthogonal and symplectic group characters. One of the pleasing features of the method is that it extends directly to plethysm. Formerly the general method of evaluating say $[\lambda] \otimes \{\mu\}$ was to express $[\lambda]$ in terms of S-functions, evaluate the plethysm and convert back into orthogonal group characters (4, p. 94). This made the labour of calculation rather tedious, and it was to avoid this tedious calculation that work was done showing that the orthogonal groups in certain numbers of variables were simply isomorphic with certain other groups (3). But this only simplified the problem in certain cases, notably in 3 and 4 variables.

The method described here gives a general method which will evaluate for the orthogonal group $[\lambda] \otimes \{\mu\}$, or for the symplectic group $\langle\lambda\rangle \otimes \{\mu\}$, in any number of variables.

The method is best described by means of an example. Consider $[21] \otimes \{2\}$. The expansion of $[21][21]$ may be obtained from Theorem I, and this is equal to

$$[21][21] = [21] \otimes \{2\} + [21] \otimes \{1^2\}.$$

To calculate $[21][21]$ the following terms are obtained:

$$\{21\}\{21\} = \{42\} + \{41^2\} + \{3^2\} + 2\{321\} + \{2^3\} + \{31^3\} + \{2^21^2\},$$

$$\{2\}\{2\} = \{4\} + \{31\} + \{2^2\},$$

$$\{1^2\}\{1^2\} = \{2^2\} + \{21^2\} + \{1^4\},$$

$$\{2\}\{1^2\} = \{31\} + \{21^2\},$$

$$\{1^2\}\{2\} = \{31\} + \{21^2\},$$

$$\{1\}\{1\} = \{2\} + \{1^2\}, \qquad \{1\}\{1\} = \{2\} + \{1^2\},$$

$$\{0\}\{0\} = \{0\}.$$













Considering the product of two equal tensors of rank 3, $A_{ijk}A_{pqr}$, it is clear that, of the terms corresponding to the product $\{21\}\{21\}$, the symmetric part corresponds to $\{21\} \otimes \{2\}$ and the skew-symmetric part to $\{21\} \otimes \{1^2\}$.

Of the first contraction $g^{ip}A_{ijk}A_{pqr}$, the terms corresponding to $\{2\}\{2\}$ are changed into themselves by the interchange of $A_{ijk}$ and $A_{pqr}$. The symmetric part will thus correspond to $\{2\} \otimes \{2\}$ and the skew-symmetric part to $\{2\} \otimes \{1^2\}$. Similar results are obtained for the terms corresponding to $\{1^2\}\{1^2\}$.

There are certain terms which correspond to $\{2\}\{1^2\}$. Interchanging $A_{ijk}$ and $A_{pqr}$, these are changed into different terms corresponding to $\{1^2\}\{2\}$. Clearly there is no reduction here corresponding to plethysm, but just one of the two products $\{2\}\{1^2\}$ and $\{1^2\}\{2\}$ is retained either for $[21] \otimes \{2\}$ or for $[21] \otimes \{1^2\}$.

Treating all the terms in this way the expansion of $[21] \otimes \{2\}$ corresponds to the expansion

$$\{21\} \otimes \{2\} + \{2\} \otimes \{2\} + \{1^2\} \otimes \{2\} + \{2\}\{1^2\} + 2\{1\} \otimes \{2\} + \{0\} \otimes \{2\}$$
$$= \{42\} + \{2^3\} + \{321\} + \{31^3\} + \{4\} + 2\{2^2\} + \{1^4\} + \{31\} + \{21^2\} + 2\{2\} + \{0\}.$$

Hence

$$[21] \otimes \{2\} = [42] + [2^3] + [321] + [31^3] + [4] + 2[2^2] + [1^4] + [31] + [21^2] + 2[2] + [0].$$
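THEOREM III. If $(\mu)$ is a partition of 2, then

$$[\lambda] \otimes \{\mu\} = \sum A_{\lambda\mu\nu}[\nu]$$

where

$$\sum A_{\lambda\mu\nu}\{\nu\} = \sum (\Gamma_{\xi\eta\lambda}\{\eta\}) \otimes \{\mu\} + \sum \Gamma_{\xi\eta\lambda}\Gamma_{\xi\zeta\lambda}\{\eta\}\{\zeta\}, \qquad (\eta) \neq (\zeta),$$

summed for all suitable S-functions $\{\xi\}$, $\{\eta\}$, $\{\zeta\}$, the last term not being repeated for the interchange of $\{\eta\}$ and $\{\zeta\}$.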


The only aspect of the Theorem which is not obvious from the above example is the position of the coefficient $\Gamma_{\xi\eta\lambda}$ in the first summation when this coefficient exceeds 1. Such a case occurs for $[321] \otimes \{2\}$ when $\{\xi\} = \{21\}$, $\{\eta\} = \{21\}$. In this case $\Gamma_{\xi\eta\lambda} = 2$. Referring to contractions of the product of two tensors, there will exist for each tensor two corresponding contractions of type $\{21\}$. If the same contraction is taken for each tensor, these will be interchangeable to give a term corresponding to $\{21\} \otimes \{2\}$ in each case. If, however, different contractions are taken for the two tensors these will not be interchangeable, and there will be a term $\{21\}\{21\}$.

Taken together these terms correspond to

$$2(\{21\} \otimes \{2\}) + \{21\}\{21\} = (2\{21\}) \otimes \{2\}.$$

The generalization for any value of $\Gamma_{\xi\eta\lambda}$ presents no difficulty.







For the symplectic group the result is slightly different. When the two tensors are interchangeable allowance must be made for the skew-symmetry of the fundamental form. The difference occurs only when $(\xi)$ is a partition of an odd number, say a partition of $m = 2k + 1$.

Let the fundamental skew-symmetric tensor be $r^{ij}$. A single contraction between the two tensors, or a contraction $m$ times, will introduce a skew-symmetric relation between the two tensors. This will have the effect of changing $\{\eta\} \otimes \{\mu\}$ into $\{\eta\} \otimes \{\tilde{\mu}\}$ where $(\tilde{\mu})$ is the partition conjugate to $(\mu)$.


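THEOREM IV. If $(\mu)$ is a partition of 2,

$$\langle\lambda\rangle \otimes \{\mu\} = \sum J_{\lambda\mu\nu}\langle\nu\rangle$$

where

$$\sum J_{\lambda\mu\nu}\{\nu\} = \sum (\Gamma_{\xi\eta\lambda}\{\eta\}) \otimes (\{\mu\} \cdot \{\epsilon\}) + \sum \Gamma_{\xi\eta\lambda}\Gamma_{\xi\zeta\lambda}\{\eta\}\{\zeta\}, \qquad (\eta) \neq (\zeta),$$

in which $(\epsilon) = (2)$ if $\{\xi\}$ is of even weight, but $(\epsilon) = (1^2)$ if $\{\xi\}$ is of odd weight.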

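As an example the expansion of $\langle 21\rangle \otimes \{2\}$ will be obtained. The corresponding expansion is

$$\{21\} \otimes \{2\} + \{2\} \otimes \{1^2\} + \{1^2\} \otimes \{1^2\} + \{2\}\{1^2\} + 2\{1\} \otimes \{2\} + \{0\} \otimes \{1^2\}$$
$$= \{42\} + \{2^3\} + \{321\} + \{31^3\} + 2\{31\} + 2\{21^2\} + 2\{2\}.$$

Thus

$$\langle 21\rangle \otimes \{2\} = \langle 42\rangle + \langle 2^3\rangle + \langle 321\rangle + \langle 31^3\rangle + 2\langle 31\rangle + 2\langle 21^2\rangle + 2\langle 2\rangle.$$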


To obtain the expansion of $[\lambda] \otimes \{\mu\}$ where $(\mu)$ is a partition of 3, a procedure is adopted which will first be illustrated with an example. To evaluate $[3] \otimes \{3\}$ consider the product of 3 equal tensors of rank 3, each of type $[3]$, say

$$A_{ijk}A_{pqr}A_{stu}.$$

Leaving out contractions with respect to the fundamental tensor $g^{ij}$, the type is

$$\{3\} \otimes \{3\} = \{9\} + \{72\} + \{63\} + \{52^2\} + \{4^21\}.$$

Allowing one contraction with $g^{ij}$ between the first and second tensors, these two tensors remain symmetric and the type is

$$(\{2\} \otimes \{2\})\{3\} = \{7\} + \{61\} + 2\{52\} + \{43\} + \{421\} + \{32^2\}.$$

One contraction between first and second, one between first and third tensors allows the symmetric interchange of the second and third tensors to give

$$\{1\}(\{2\} \otimes \{2\}) = \{5\} + \{41\} + \{32\} + \{2^21\}.$$
















Three contractions, one between each pair, allow the three tensors to be permuted symmetrically, and give

$$\{1\} \otimes \{3\} = \{3\}.$$

Two contractions between first and second tensors give

$$(\{1\} \otimes \{2\})\{3\} = \{5\} + \{41\} + \{32\}.$$

Two between first and second, one between first and third give

$$\{1\}\{2\} = \{3\} + \{21\}.$$

Two between first and second, one between first and third, one between second and third give

$$\{1\}.$$

Finally three contractions between the first two tensors give

$$\{3\}.$$

Hence

$$[3] \otimes \{3\} = [9] + [72] + [63] + [52^2] + [4^21] + [7] + [61] + 2[52] + [43] + [421] + [32^2] + 2[5] + 2[41] + 2[32] + [2^21] + 3[3] + [21] + [1].$$


Consider now the general case $[\lambda] \otimes \{\mu\}$ with $(\mu)$ a partition of 3. It is required to obtain the contractions of the product of three tensors, each of type $[\lambda]$. Let the contractions with $g^{ij}$ between the first and second correspond to the S-function $\{\gamma\}$, between the first and third to $\{\beta\}$ and between the second and third to $\{\alpha\}$. The two sets of contractions of the first correspond to $\{\beta\}$ and $\{\gamma\}$, so that the contracted tensor is of type $\sum \Gamma_{\xi\beta\gamma}\{\xi\}$, where $\Gamma_{\xi\beta\gamma}$ is the coefficient of $\{\lambda\}$ in the product $\{\xi\}\{\beta\}\{\gamma\}$. The type of the contracted product is thus

$$\sum \Gamma_{\xi\beta\gamma}\Gamma_{\eta\gamma\alpha}\Gamma_{\zeta\alpha\beta}\{\xi\}\{\eta\}\{\zeta\}.$$

Allowing permutations of the three tensors each such term is repeated 6 times, except in certain cases of equality. But only $f^{(\mu)}$ of the 6 terms are retained for $[\lambda] \otimes \{\mu\}$, where $f^{(\mu)}$ is the degree of the representation corresponding to $(\mu)$ of the symmetric group on 3 symbols.

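Consider now the cases of equality. Such a case arises when $\{\alpha\} = \{\beta\}$, $\{\xi\} = \{\eta\}$. The corresponding term for $[\lambda][\lambda][\lambda]$ is

$$\Gamma_{\xi\alpha\gamma}^2\,\Gamma_{\zeta\alpha\alpha}\{\xi\}\{\xi\}\{\zeta\}.$$

The interchange of the first two tensors leaves this unaltered. It has the effect of interchanging the two $\alpha$'s in the coefficient $\Gamma_{\zeta\alpha\alpha}$, and also the two $\{\xi\}$'s in the product $\{\xi\}\{\xi\}$. In the case of $[\lambda] \otimes \{3\}$ either both interchanges must be symmetric or both skew-symmetric. Let $\Gamma'_{\zeta\alpha}$ be the coefficient of $\{\lambda\}$ in

$$\{\zeta\}(\{\alpha\} \otimes \{2\}),$$

and $\Gamma''_{\zeta\alpha}$ the coefficient of $\{\lambda\}$ in

$$\{\zeta\}(\{\alpha\} \otimes \{1^2\}).$$

The corresponding term for $[\lambda] \otimes \{3\}$ is

$$\sum \Gamma'_{\zeta\alpha}\{\zeta\}\,[(\Gamma_{\xi\alpha\gamma}\{\xi\}) \otimes \{2\}] + \sum \Gamma''_{\zeta\alpha}\{\zeta\}\,[(\Gamma_{\xi\alpha\gamma}\{\xi\}) \otimes \{1^2\}].$$

The term for $[\lambda] \otimes \{1^3\}$ is obtained from this by interchanging $\Gamma'$ and $\Gamma''$. Since $\{21\}$ appears in both the products $\{2\}\{1\}$ and $\{1^2\}\{1\}$, the corresponding term for $[\lambda] \otimes \{21\}$ is the sum of the two, or

$$\sum \Gamma_{\zeta\alpha\alpha}\Gamma_{\xi\alpha\gamma}^2\{\xi\}\{\xi\}\{\zeta\}.$$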


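The cases when $\{\alpha\} = \{\gamma\}$, $\{\xi\} = \{\zeta\}$, or when $\{\beta\} = \{\gamma\}$, $\{\eta\} = \{\zeta\}$ become equivalent to the above case by a rearrangement of the three tensors, and these cases need not be considered.

There remains only the case

$$\{\alpha\} = \{\beta\} = \{\gamma\}, \qquad \{\xi\} = \{\eta\} = \{\zeta\}.$$

The numerical coefficient becomes

$$\Gamma_{\xi\alpha\alpha}^3 = (\Gamma'_{\xi\alpha} + \Gamma''_{\xi\alpha})^3 = \Gamma'^{\,3}_{\xi\alpha} + \Gamma''^{\,3}_{\xi\alpha} + 3\Gamma'^{\,2}_{\xi\alpha}\Gamma''_{\xi\alpha} + 3\Gamma'_{\xi\alpha}\Gamma''^{\,2}_{\xi\alpha}.$$

The terms which correspond to $\Gamma'^{\,2}_{\xi\alpha}\Gamma''_{\xi\alpha}$ and to $\Gamma'_{\xi\alpha}\Gamma''^{\,2}_{\xi\alpha}$ are treated in the same way as the case considered above when $\{\alpha\} = \{\beta\}$, $\{\xi\} = \{\eta\}$.

The term $\Gamma'^{\,3}_{\xi\alpha}$ for $[\lambda] \otimes \{\mu\}$ corresponds to

$$\sum (\Gamma'_{\xi\alpha}\{\xi\}) \otimes \{\mu\}.$$

The term $\Gamma''^{\,3}_{\xi\alpha}$ implies a skew-symmetry for every interchange among the 3 tensors. This has the effect of converting $\{\mu\}$ into $\{\tilde{\mu}\}$, $(\tilde{\mu})$ being the conjugate partition to $(\mu)$.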


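THEOREM V. If $(\mu)$ is a partition of 3,

$$[\lambda] \otimes \{\mu\} = \sum H_{\lambda\mu\nu}[\nu]$$

where

$$\begin{aligned}
\sum H_{\lambda\mu\nu}\{\nu\} = {} & f^{(\mu)} \sum \Gamma_{\xi\beta\gamma}\Gamma_{\eta\gamma\alpha}\Gamma_{\zeta\alpha\beta}\{\xi\}\{\eta\}\{\zeta\} \\
& + \sum [(\Gamma_{\xi\alpha\gamma}\{\xi\}) \otimes \{\mu'\}]\,\Gamma'_{\zeta\alpha}\{\zeta\} + \sum [(\Gamma_{\xi\alpha\gamma}\{\xi\}) \otimes \{\tilde{\mu}'\}]\,\Gamma''_{\zeta\alpha}\{\zeta\} \\
& + \sum (\Gamma'_{\xi\alpha}\{\xi\}) \otimes \{\mu\} + \sum (\Gamma''_{\xi\alpha}\{\xi\}) \otimes \{\tilde{\mu}\} \\
& + \sum [(\Gamma'_{\xi\alpha}\{\xi\}) \otimes \{\mu'\}]\,\Gamma''_{\xi\alpha}\{\xi\} + \sum [(\Gamma''_{\xi\alpha}\{\xi\}) \otimes \{\tilde{\mu}'\}]\,\Gamma'_{\xi\alpha}\{\xi\}.
\end{aligned}$$

In this expression, $f^{(\mu)}$ is the degree of the representation corresponding to $\chi^{(\mu)}$; $(\mu') = (2)$ if $(\mu) = (3)$, $(\mu') = (1^2)$ if $(\mu) = (1^3)$, $(\mu') = (2) + (1^2)$ if $(\mu) = (21)$; $\Gamma_{\xi\beta\gamma}$ is the coefficient of $\{\lambda\}$ in $\{\xi\}\{\beta\}\{\gamma\}$, $\Gamma'_{\xi\alpha}$ is the coefficient of $\{\lambda\}$ in $\{\xi\}(\{\alpha\} \otimes \{2\})$, $\Gamma''_{\xi\alpha}$ is the coefficient of $\{\lambda\}$ in $\{\xi\}(\{\alpha\} \otimes \{1^2\})$. Terms are omitted in any summation when cases of equality lead to corresponding terms in a later summation.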

The case of the symplectic group is very similar. The only differences arise by making allowance for the skew-symmetry of the fundamental form.

THEOREM VI. If $(\mu)$ is a partition of 3,

$$\langle\lambda\rangle \otimes \{\mu\} = \sum J_{\lambda\mu\nu}\langle\nu\rangle$$

where the definition of $\sum J_{\lambda\mu\nu}\{\nu\}$ differs from that of $\sum H_{\lambda\mu\nu}\{\nu\}$ in Theorem V only by the interchange of $\{\mu'\}$ and $\{\tilde{\mu}'\}$ in the second and third summations when $(\gamma)$ is a partition of an odd number, and by the interchange of $\{\mu'\}$ and $\{\tilde{\mu}'\}$, $\{\mu\}$ and $\{\tilde{\mu}\}$ in the last 4 summations when $(\alpha)$ is a partition of an odd number.

The method extends readily to the cases when $(\mu)$ is a partition of 4, 5, 6, etc., no essentially new concept being required. But the details become more and more complicated. The statement of a Theorem even for $(\mu)$ a partition of 4 must involve so many special cases that it does not seem worth while to enunciate.


4. Symmetric group representations. The method can be applied to representations of the symmetric group, the results being equivalent to evaluating the inner product (5) and the inner plethysm (6) of S-functions.

Henceforward in this paper $[\lambda] = [\lambda_1, \ldots, \lambda_i]$, where $(\lambda)$ is a partition of $p$, will denote the S-function $\{n - p, \lambda_1, \ldots, \lambda_i\}$.

The symmetric group of permutations on $n$ symbols is the group of $n$-rowed permutation matrices, and is thus a sub-group of the full linear group on $n$ variables $x_1, x_2, \ldots, x_n$. It is in fact the restricted group which leaves invariant a set of forms of respective degrees $1, 2, \ldots, n$, namely, the forms

$$S_1 = \sum x_i, \quad S_2 = \sum x_i^2, \quad \ldots, \quad S_n = \sum x_i^n.$$

Clearly a permutation of the $x_i$'s will leave invariant these symmetric functions of the $x_i$'s, and conversely if the values of $S_1, S_2, \ldots, S_n$ are assigned the $x_i$'s will be the roots of a certain equation of degree $n$ and the only possible transformations will be the permutations of the roots.
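The invariance statement can be illustrated with a few lines of Python. This is only a sketch with arbitrarily chosen sample values and hypothetical names (`power_sums`, `flipped`): it checks that every permutation of the variables preserves all the power sums, whereas a sign change of one variable, which is linear but not a permutation, keeps the quadratic form and alters the cubic one.

```python
from itertools import permutations


def power_sums(x):
    """Return [S_1, ..., S_n] with S_k = sum of x_i**k, where n = len(x)."""
    return [sum(v ** k for v in x) for k in range(1, len(x) + 1)]


x = [2, -1, 5]  # arbitrary sample values, chosen only for illustration
base = power_sums(x)

# Every permutation of the variables leaves all the S_k unchanged.
permutations_preserve = all(
    power_sums([x[i] for i in perm]) == base
    for perm in permutations(range(len(x)))
)

# The sign change x_1 -> -x_1 is linear and keeps the quadratic form S_2,
# but it is not a permutation and it alters the cubic form S_3.
flipped = power_sums([-x[0], x[1], x[2]])
flip_keeps_s2 = flipped[1] == base[1]
flip_changes_s3 = flipped[2] != base[2]

print(permutations_preserve, flip_keeps_s2, flip_changes_s3)
```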

The tensor coefficients of these forms will be denoted by $R_i$, $R_{ij}$, $R_{ijk}$, etc. Although the forms are algebraically independent, the tensors are connected in that every one can be expressed as a concomitant tensor of the quadratic and cubic tensors $R_{ij}$ and $R_{ijk}$. Since the quadratic tensor $R_{ij}$ is available for raising and lowering suffixes, which it does without modification, there is no distinction between upper and lower suffixes. Then clearly

$$R_{ij}R_{ijk} = R_k, \qquad R_{ijp}R_{pkl} = R_{ijkl},$$

with similar results for tensors of rank 5, 6, etc.
Since transformations which leave certain tensors invariant also leave 
invariant every concomitant tensor, the following Theorem results. 


THEOREM VII. The symmetric group on n symbols is the subgroup of the full 
linear group in n-variables which leaves a quadratic form and a cubic form 
invariant. 


The linear concomitant can be used to reduce the number of variables from $n$ to $n - 1$. The full linear group in $n - 1$ variables may therefore be taken if it is assumed that the linear concomitant is identically zero.

The characters of the symmetric group can be obtained from those of the full linear group in a similar manner to that used for the orthogonal group, namely by considering a tensor corresponding to any partition $(\lambda)$ of any integer $p$, and removing all possible contractions with the fundamental forms (2, p. 392). The remainder when all contractions are removed is an irreducible character, provided that $n - p \geq \lambda_1$, and it is not difficult to see that it is in fact the character of the symmetric group corresponding to the partition $(n - p, \lambda_1, \ldots, \lambda_i)$. It is convenient to represent by $[\lambda]$ not this character, but the corresponding S-function

$$[\lambda] = \{n - p, \lambda_1, \ldots, \lambda_i\}.$$

The inner product and inner plethysm of these S-functions correspond exactly to products and plethysms of symmetric group characters (5; 6).
THEOREM VIII. If

$$A_{i_1 i_2 \ldots i_r}$$

is an irreducible tensor under the symmetric group corresponding to the S-function $[\lambda]$, where $(\lambda)$ is a partition of $r$, then for every non-zero term in the tensor all the suffixes will be different.

To prove this it is sufficient to show that if two suffixes are equal there exists a non-zero contraction. Suppose that $i_1 = i_2$ and consider the contraction

$$B_{j i_3 \ldots i_r} = R_{i_1 i_2 j}A_{i_1 i_2 \ldots i_r}.$$

Then the term corresponding to $j = i_1 = i_2$ is non-zero and the contracted tensor does not vanish, contrary to hypothesis. The Theorem follows readily.
Consider now the inner product of two S-functions $[\lambda] \cdot [\mu]$, where $(\lambda)$ is a partition of $p$ and $(\mu)$ a partition of $q$. Corresponding to $[\lambda]$, $[\mu]$ respectively are two tensors

$$A = A_{i_1 \ldots i_p}, \qquad B = B_{j_1 \ldots j_q}.$$

These two tensors are of type $\{\lambda\}$, $\{\mu\}$ respectively over the full linear group, but have had removed from them all contractions with $R_{ij}$, $R_{ijk}$.


The product is of type under the full linear group corresponding to the product

$$\{\lambda\}\{\mu\} = \sum \Gamma_{\lambda\mu\nu}\{\nu\}.$$

The subtensor corresponding to $\{\nu\}$ will have had some, but not all, of its contractions removed. The principal part of this tensor, of type $[\nu]$, will not be zero since no contraction at all is involved. It follows that the inner product $[\lambda] \cdot [\mu]$ includes $\sum \Gamma_{\lambda\mu\nu}[\nu]$. To determine what other terms are involved it is necessary to determine what non-zero contractions can be formed with $R_{ij}$, $R_{ijk}$, etc.

Since the suffixes in $A$ are all distinct and the suffixes of $R_{ij}$, $R_{ijk}$ or $R_{ijkp}$ are all equal, it is clear that for a non-zero result only one contraction can occur between these two tensors, and only one contraction between the fundamental tensor and $B$. Thus exactly two of the suffixes of the fundamental tensor can be contracted away. If the fundamental tensor is quartic, two suffixes remain and there is no reduction in the rank. The contractions we are seeking, however, are of lower rank. There are thus two possibilities, contraction with $R_{ij}$, just as for the orthogonal group, and contraction with $R_{ijk}$, leaving one uncontracted suffix in the place of two.

If the tensor $R_{ij}$ is used $r$ times the suffixes removed from $A$ will correspond to a partition $(\alpha)$ of $r$, and since the tensors $R_{ij}$ are symmetrically disposed, the $r$ suffixes removed from $B$ will correspond to the same partition $(\alpha)$ of $r$, just as with the orthogonal group.

Suppose that the tensor $R_{ijk}$ is used $s$ times. The suffixes removed from $A$ will correspond to a partition $(\beta)$ of $s$, the suffixes removed from $B$ to a partition $(\gamma)$ of $s$. In order that the $s$ tensors $R_{ijk}$ may be symmetrically disposed the remaining $s$ uncontracted suffixes of the $R_{ijk}$ must correspond to $\{\beta\} \cdot \{\gamma\}$.

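The types of the contracted tensors $A$ and $B$ respectively are

$$\sum \Gamma_{\alpha\beta\xi\lambda}\{\xi\} \quad \text{and} \quad \sum \Gamma_{\alpha\gamma\eta\mu}\{\eta\}.$$

The type of the contracted product is thus

$$\sum \Gamma_{\alpha\beta\xi\lambda}\Gamma_{\alpha\gamma\eta\mu}\{\xi\}\{\eta\}(\{\beta\} \cdot \{\gamma\}).$$

THEOREM IX. The inner product of two S-functions $[\lambda]$, $[\mu]$, each of weight $n$, is given by

$$[\lambda] \cdot [\mu] = \sum P_{\lambda\mu\nu}[\nu]$$

where

$$\sum P_{\lambda\mu\nu}\{\nu\} = \sum \Gamma_{\alpha\beta\xi\lambda}\Gamma_{\alpha\gamma\eta\mu}\{\xi\}\{\eta\}(\{\beta\} \cdot \{\gamma\}).$$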







The coefficient $\Gamma_{\alpha\beta\xi\lambda}$ is defined as the coefficient of $\{\lambda\}$ in the product $\{\alpha\}\{\beta\}\{\xi\}$. The summation is with respect to all suitable partitions $(\alpha)$, $(\beta)$, $(\gamma)$ of which $(\beta)$ and $(\gamma)$ are partitions of the same integer. Those cases are included for which $(\alpha) = (0)$, and/or $(\beta) = (\gamma) = (0)$.

As an example consider the inner product $[21] \cdot [21]$. First,

$$\{21\}\{21\} = \{42\} + \{41^2\} + \{3^2\} + 2\{321\} + \{2^3\} + \{31^3\} + \{2^21^2\}.$$

Take next the cases for which $(\beta) = (\gamma) = (0)$ with respectively $(\alpha) = (1)$, $(\alpha) = (2)$, $(\alpha) = (1^2)$, $(\alpha) = (21)$. These give

$$(\{2\} + \{1^2\})(\{2\} + \{1^2\}) + \{1\}\{1\} + \{1\}\{1\} + \{0\}$$
$$= \{4\} + 3\{31\} + 2\{2^2\} + 3\{21^2\} + \{1^4\} + 2\{2\} + 2\{1^2\} + \{0\}.$$

Next with $(\beta) = (\gamma) = (1)$ and $(\alpha)$ respectively $(0)$, $(1)$, $(2)$, $(1^2)$ the following terms result:

$$(\{2\} + \{1^2\})(\{2\} + \{1^2\})\{1\} + 4\{1\}\{1\}\{1\} + \{1\} + \{1\}$$
$$= \{5\} + 4\{41\} + 5\{32\} + 6\{31^2\} + 5\{2^21\} + 4\{21^3\} + \{1^5\} + 4\{3\} + 8\{21\} + 4\{1^3\} + 2\{1\}.$$

With $(\beta) = (\gamma) = (2)$, and $(\alpha) = (0)$, $(\alpha) = (1)$, the terms are

$$\{1\}\{1\}\{2\} + \{0\}\{0\}\{2\} = \{4\} + 2\{31\} + \{2^2\} + \{21^2\} + \{2\}.$$

Also $(\beta) = (\gamma) = (1^2)$ gives exactly the same result.

But $(\beta) = (2)$, $(\gamma) = (1^2)$ gives

$$\{1\}\{1\}\{1^2\} + \{0\}\{0\}\{1^2\} = \{31\} + \{2^2\} + 2\{21^2\} + \{1^4\} + \{1^2\},$$

with precisely the same result for $(\beta) = (1^2)$, $(\gamma) = (2)$.

Lastly when $(\beta) = (\gamma) = (21)$ the result is

$$\{21\} \cdot \{21\} = \{3\} + \{21\} + \{1^3\}.$$

Hence

$$\begin{aligned}
[21] \cdot [21] = {} & [42] + [41^2] + [3^2] + 2[321] + [2^3] + [31^3] + [2^21^2] \\
& + [5] + 4[41] + 5[32] + 6[31^2] + 5[2^21] + 4[21^3] + [1^5] \\
& + 3[4] + 9[31] + 6[2^2] + 9[21^2] + 3[1^4] + 5[3] + 9[21] + 5[1^3] \\
& + 4[2] + 4[1^2] + 2[1] + [0].
\end{aligned}$$

The result conforms with that given by Murnaghan (7).


5. Inner plethysm of S-functions. The method extends immediately to the evaluation of inner plethysms. To evaluate $[\lambda] \odot \{\mu\}$ where $(\mu)$ is a partition of 2, consider the inner product $[\lambda] \cdot [\lambda]$ as given by Theorem IX. The coefficients $P_{\lambda\lambda\nu}$ are obtained from the expression

$$\sum \Gamma_{\alpha\beta\xi\lambda}\Gamma_{\alpha\gamma\eta\lambda}\{\xi\}\{\eta\}(\{\beta\} \cdot \{\gamma\}).$$

For each term in this expansion there is an equal term obtained by interchanging $\{\beta\}$ and $\{\gamma\}$, $\{\xi\}$ and $\{\eta\}$. Provided that these two equal terms are distinct, one only of the pair will appear for $[\lambda] \odot \{2\}$ and one for $[\lambda] \odot \{1^2\}$.

The situation is different if $\{\xi\} = \{\eta\}$, $\{\beta\} = \{\gamma\}$, for then the interchange of the two factors $[\lambda]$ in the inner product changes this term into itself. It is therefore possible to separate the symmetric and the skew-symmetric components of the product. This term in $[\lambda] \cdot [\lambda]$ is then

$$\Gamma_{\alpha\beta\xi\lambda}^2\{\xi\}\{\xi\}(\{\beta\} \cdot \{\beta\}) = \Gamma_{\alpha\beta\xi\lambda}^2(\{\xi\} \otimes \{2\} + \{\xi\} \otimes \{1^2\})(\{\beta\} \odot \{2\} + \{\beta\} \odot \{1^2\}).$$

Of the four terms obtained by expanding the right hand side the choice of $\{\xi\} \otimes \{1^2\}$ rather than $\{\xi\} \otimes \{2\}$ indicates a change of sign for the interchange. Similarly the choice of $\{\beta\} \odot \{1^2\}$ rather than $\{\beta\} \odot \{2\}$ also indicates a change of sign.


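THEOREM X. If $(\mu)$ is a partition of 2,

$$[\lambda] \odot \{\mu\} = \sum Q_{\lambda\mu\nu}[\nu]$$

where

$$\begin{aligned}
\sum Q_{\lambda\mu\nu}\{\nu\} = {} & \sum \Gamma_{\alpha\beta\xi\lambda}\Gamma_{\alpha\gamma\eta\lambda}\{\xi\}\{\eta\}(\{\beta\} \cdot \{\gamma\}) \\
& + \sum \big[((\Gamma_{\alpha\beta\xi\lambda}\{\xi\}) \otimes \{2\})(\{\beta\} \odot \{\mu\}) + ((\Gamma_{\alpha\beta\xi\lambda}\{\xi\}) \otimes \{1^2\})(\{\beta\} \odot \{\tilde{\mu}\})\big].
\end{aligned}$$

In the first summation the term is not repeated for the interchange of $\{\xi\}$ and $\{\eta\}$, $\{\beta\}$ and $\{\gamma\}$, and those terms are omitted for which $\{\xi\} = \{\eta\}$, $\{\beta\} = \{\gamma\}$.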


As an example consider $[21] \odot \{2\}$. Of the terms considered above for $[21] \cdot [21]$, the term $\{21\}\{21\}$ is replaced by

$$\{21\} \otimes \{2\} = \{42\} + \{321\} + \{31^3\} + \{2^3\}.$$

The cases $(\beta) = (\gamma) = (0)$, $(\alpha) = (1)$, $(2)$, $(1^2)$, $(21)$ give

$$\{2\} \otimes \{2\} + \{1^2\} \otimes \{2\} + \{2\}\{1^2\} + \{1\} \otimes \{2\} + \{1\} \otimes \{2\} + \{0\} \otimes \{2\}$$
$$= \{4\} + \{31\} + 2\{2^2\} + \{21^2\} + \{1^4\} + 2\{2\} + \{0\}.$$

For $(\beta) = (\gamma) = (1)$, $(\alpha) = (0)$, $(1)$, $(2)$, $(1^2)$, the terms are

$$(\{2\} \otimes \{2\} + \{1^2\} \otimes \{2\} + \{2\}\{1^2\})\{1\} + [(2\{1\}) \otimes \{2\}]\{1\} + \{1\} + \{1\}$$
$$= \{5\} + 2\{41\} + 3\{32\} + 2\{31^2\} + 3\{2^21\} + 2\{21^3\} + \{1^5\} + 3\{3\} + 4\{21\} + \{1^3\} + 2\{1\}.$$

For $(\beta) = (\gamma) = (2)$ with $(\alpha) = (0)$, $(1)$,

$$(\{1\} \otimes \{2\})\{2\} + \{0\}\{2\} = \{4\} + \{31\} + \{2^2\} + \{2\},$$

with an equal result for $(\beta) = (\gamma) = (1^2)$.

For $(\beta) = (2)$, $(\gamma) = (1^2)$ the term is the same as in $[21] \cdot [21]$, namely,

$$\{1\}\{1\}\{1^2\} + \{0\}\{0\}\{1^2\} = \{31\} + \{2^2\} + 2\{21^2\} + \{1^4\} + \{1^2\},$$

but this is taken once only. Lastly for $(\beta) = (\gamma) = (21)$ the terms are

$$\{21\} \odot \{2\} = \{3\} + \{21\}.$$

Summing this gives

$$\begin{aligned}
[21] \odot \{2\} = {} & [42] + [321] + [31^3] + [2^3] + [5] + 2[41] + 3[32] + 2[31^2] \\
& + 3[2^21] + 2[21^3] + [1^5] + 3[4] + 4[31] + 5[2^2] + 3[21^2] + 2[1^4] + 4[3] \\
& + 5[21] + [1^3] + 4[2] + [1^2] + 2[1] + [0].
\end{aligned}$$

This result has been checked as follows. Taking $n = 10$, it gives the expansion of $\{721\} \odot \{2\}$. The total degree of the representations on the right is then found to be 12,880 which is correctly equal to $\tfrac{1}{2}(160 \times 161)$.
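The degree check just described can be repeated mechanically. The Python sketch below is an illustration added here, not the author's procedure: it assumes the hook length formula for the degrees of symmetric group representations, the helper names `degree` and `lift` are hypothetical, and the table `EXPANSION` is transcribed by hand from the expansion of [21] ⊙ {2} printed above.

```python
from math import factorial


def degree(shape):
    """Degree of the symmetric group representation {shape} (hook lengths)."""
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = cols[j] - i - 1
            hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks


def lift(lam, n):
    """[lam] -> {n - p, lam_1, ...}, for lam a partition of p."""
    return (n - sum(lam),) + tuple(lam)


# Coefficients of [lam] in the printed expansion, keyed by lam.
EXPANSION = {
    (4, 2): 1, (3, 2, 1): 1, (3, 1, 1, 1): 1, (2, 2, 2): 1,
    (5,): 1, (4, 1): 2, (3, 2): 3, (3, 1, 1): 2, (2, 2, 1): 3,
    (2, 1, 1, 1): 2, (1, 1, 1, 1, 1): 1,
    (4,): 3, (3, 1): 4, (2, 2): 5, (2, 1, 1): 3, (1, 1, 1, 1): 2,
    (3,): 4, (2, 1): 5, (1, 1, 1): 1,
    (2,): 4, (1, 1): 1, (1,): 2, (): 1,
}

n = 10
d = degree(lift((2, 1), n))            # degree of {721}
total = sum(c * degree(lift(lam, n)) for lam, c in EXPANSION.items())
print(d, total, d * (d + 1) // 2)
```

The symmetrized square of a representation of degree $d$ has degree $\tfrac{1}{2}d(d+1)$, which is the quantity the total is compared with. At $n = 10$ every term $\{n - p, \lambda_1, \ldots\}$ in the table is a proper partition, so no modification rule is needed in this sketch.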

The extension to $[\lambda] \odot \{\mu\}$ where $(\mu)$ is a partition of any integer, is straightforward, but becomes complicated even in comparatively simple cases because of the multiplicity of the possible contractions with $R_{ij}$, $R_{ijk}$, $R_{ijkp}$, etc. It does not seem worth while to attempt to express a general theorem, but the method will be illustrated with respect to the comparatively simple case $[2] \odot \{3\}$.

Denote by $C_{12}$ contractions from the first and second tensors with $R_{ij}$, by $C_{12}'$ similar contractions with $R_{ijk}$. Denote by $C_{123}$ contractions from all three tensors with $R_{ijk}$ and by $C_{123}'$ similar contractions with $R_{ijkp}$.

The possibilities will be listed below.

$C_{12}$: $\{2\}\{2\} = \{4\} + \{31\} + \{2^2\}$.
$C_{12}'$: $\{2\}\{2\}\{1\} = \{5\} + 2\{41\} + 2\{32\} + \{31^2\} + \{2^21\}$.
$C_{12}^2$: $\{2\} = \{2\}$.
$C_{12}C_{12}'$: $\{2\}\{1\} = \{3\} + \{21\}$.
$C_{12}'^{\,2}$: $\{2\}\{2\} = \{4\} + \{31\} + \{2^2\}$.
$C_{12}C_{13}$: $\{2\} = \{2\}$.
$C_{12}C_{13}'$: $\{1\}\{1\}\{1\} = \{3\} + 2\{21\} + \{1^3\}$.
$C_{12}'C_{13}'$: $\{2\}\{2\} + \{1^2\}\{1^2\} = \{4\} + \{31\} + 2\{2^2\} + \{21^2\} + \{1^4\}$.


Before proceeding, a word of explanation may be needed here. There are two tensors $R_{ijk}$, the first suffix of each being contracted with the first tensor of type $[2]$. The suffixes are necessarily symmetric. The second suffixes, contracted respectively with the second and third tensors, can be either symmetric or skew-symmetric. The uncontracted suffixes of the second and third tensors will likewise be symmetric or skew-symmetric in the respective cases. Further, if the second suffixes of the tensors $R_{ijk}$ are skew-symmetric, then the third uncontracted suffixes must also be skew-symmetric.

The remaining possibilities are as follows:

$C_{12}C_{13}C_{23}$: $\{0\}$.
$C_{12}C_{13}C_{23}'$: $\{1\}$.
$C_{12}C_{13}'C_{23}'$: $\{2\}$.
$C_{12}'C_{13}'C_{23}'$: $\{3\}$.
$C_{123}$: $\{3\}$.
$C_{123}'$: $\{3\}\{1\} = \{4\} + \{31\}$.
$C_{123}C_{12}$: $\{1\}$.
$C_{123}C_{12}'$: $\{1\}\{1\} = \{2\} + \{1^2\}$.
$C_{123}'C_{12}$: $\{1\}\{1\} = \{2\} + \{1^2\}$.
$C_{123}'C_{12}'$: $\{1\}\{1\}\{1\} = \{3\} + 2\{21\} + \{1^3\}$.
$C_{123}^2$: $\{0\}$.
$C_{123}C_{123}'$: $\{1\}$.
$C_{123}'^{\,2}$: $\{2\}$.


Summing 


$$[2] \odot \{3\} = [6] + [42] + [2^3] + [5] + 2[41] + 2[32] + [31^2] + [2^2 1]
+ 4[4] + 4[31] + 4[2^2] + [21^2] + [1^4] + 5[3] + 5[21] + 2[1^3] + 6[2]
+ 2[1^2] + 3[1] + 2[0].$$


This result has been checked by obtaining the total degree of the represen- 
tation for S-functions of weight 10, and gives correctly 


7,770 = 35.36.37/6. 


There is one case of special importance for which a general formula can
be found. That is the case $[1] \odot \{\mu\}$. This is equivalent to expressing the
general S-function $\{\mu\}$ as a sum of symmetric group characters, when the
symmetric group is regarded as a sub-group of the full linear group.

To express the result it is convenient to employ an analogue of differential 
operators, following the method of Foulkes (1). Let 


$$D(\{\lambda\})\{\mu\} = \sum \Gamma_{\lambda\nu\mu}\{\nu\}.$$


Such operators satisfy


$$D(\{\lambda\} + \{\mu\}) = D(\{\lambda\}) + D(\{\mu\}),$$
$$D(\{\lambda\}\{\mu\}) = D(\{\lambda\})\,D(\{\mu\}).$$


Let $(\mu)$ be a partition of $m$. It is required to find $[1] \odot \{\mu\}$. Following the
method used for the orthogonal group (4, p. 393), consider a general tensor
of type $\{\mu\}$ and note all the tensor forms that can be obtained by contractions
with the fundamental tensors.

Consider first the tensor $R_{ij}$, repeated $r$ times. The total set of $2r$ contracted
suffixes correspond to a concomitant of $R_{ij}$, and therefore to a term in the
expansion of $\{2\} \otimes \{r\}$. The type of the tensor corresponding to $\{\mu\}$ is therefore
reduced to $D(\{2\} \otimes \{r\})\{\mu\}$. Similarly for complete contractions with
$R_{ijk}$ the corresponding operator is $D(\{3\} \otimes \{r'\})$, and so on. It remains to
consider contractions with fundamental tensors which leave one suffix of the
fundamental tensor uncontracted.

Suppose that there are $s$ tensors $R_{ijk}$ for which the first two suffixes only
are contracted with the tensor of type $\{\mu\}$. The type of the contracted suffixes
is $\{2\} \otimes \{\lambda\}$ for some partition $(\lambda)$ of $s$. In order that the $s$ tensors $R_{ijk}$ may
be symmetrically disposed the last suffixes must correspond to the same partition
$(\lambda)$. The type of the operator is therefore


$$\{\lambda\}\, D(\{2\} \otimes \{\lambda\})\{\mu\}.$$


Similar results hold for the fundamental tensors of rank 4, 5, 6, etc.


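THEOREM XI. If
$$\sum \{\lambda_2\}\{\lambda_3\} \cdots \{\lambda_i\}\, D(\{2\} \otimes \{\lambda_2\}) \cdots D(\{i\} \otimes \{\lambda_i\})\, D(\{2\} \otimes \{r_2\}) \cdots D(\{j\} \otimes \{r_j\})\{\mu\} = \sum V_{\mu\nu}\{\nu\},$$
the summation on the left being with respect to any combination of partitions
$(\lambda_2), \ldots, (\lambda_i)$ and of integers $r_2, \ldots, r_j$, including $(\lambda_k) = (0)$ and $r_k = 0$,
then
$$[1] \odot \{\mu\} = \sum V_{\mu\nu}[\nu].$$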
The following examples illustrate. Consider first $[1] \odot \{21\}$. Since
$$[1 + D(\{2\}) + \{1\} D(\{2\})]\{21\} = \{21\} + \{1\} + \{2\} + \{1^2\},$$
therefore
$$[1] \odot \{21\} = [21] + [2] + [1^2] + [1].$$
As a check, for $n = 6$ this gives
$$\{51\} \odot \{21\} = \{321\} + \{42\} + \{41^2\} + \{51\}.$$
The degree of the representation is, correctly,
$$\tfrac{1}{3} \cdot 4 \cdot 5 \cdot 6 = 16 + 9 + 10 + 5.$$


Next consider $[1] \odot \{31\}$. Since
$$\begin{aligned}
&[1 + D(\{2\}) + \{1\}D(\{2\}) + \{1^2\}D(\{31\}) + \{1\}D(\{2\})D(\{2\}) + D(\{3\}) + \{1\}D(\{3\})]\{31\} \\
&\quad = \{31\} + \{2\} + \{1^2\} + \{3\} + 2\{21\} + \{1^3\} + \{1^2\} + \{1\} + \{1\} + \{2\} + \{1^2\},
\end{aligned}$$
therefore
$$[1] \odot \{31\} = [31] + [3] + 2[21] + [1^3] + 2[2] + 3[1^2] + 2[1].$$
This formula for $n = 7$ gives
$$\{61\} \odot \{31\} = \{3^2 1\} + \{43\} + 2\{421\} + \{41^3\} + 2\{52\} + 3\{51^2\} + 2\{61\}.$$
The degree of the representation gives, correctly,
$$\tfrac{1}{8} \cdot 5 \cdot 6 \cdot 7 \cdot 8 = 21 + 14 + 70 + 20 + 28 + 45 + 12.$$
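
Both degree checks for $[1] \odot \{\mu\}$ can be reproduced numerically. The sketch below is illustrative only: it assumes the hook-length formula for degrees of symmetric group characters and the hook-content formula for the dimension of the linear group representation $\{\mu\}$, and the helper names (`hooks`, `sn_degree`, `gl_dimension`) are hypothetical. The representation $\{n-1, 1\}$ has degree $n - 1$, so $\{\mu\}$ is evaluated in dimension $n - 1$.

```python
from math import factorial


def hooks(shape):
    """Hook lengths of the Young diagram of `shape`, row by row."""
    cols = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    return [[(row - j - 1) + (cols[j] - i - 1) + 1 for j in range(row)]
            for i, row in enumerate(shape)]


def sn_degree(shape):
    """Degree of the S_n character labelled by `shape` (hook-length formula)."""
    product = 1
    for row in hooks(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product


def gl_dimension(shape, dim):
    """Dimension of the representation {shape} of the full linear group (hook-content formula)."""
    numerator = denominator = 1
    for i, row in enumerate(hooks(shape)):
        for j, h in enumerate(row):
            numerator *= dim + j - i
            denominator *= h
    return numerator // denominator


# Right-hand sides of the two expansions quoted in the text, with multiplicities.
check_21 = {(3, 2, 1): 1, (4, 2): 1, (4, 1, 1): 1, (5, 1): 1}                 # n = 6
check_31 = {(3, 3, 1): 1, (4, 3): 1, (4, 2, 1): 2, (4, 1, 1, 1): 1,
            (5, 2): 2, (5, 1, 1): 3, (6, 1): 2}                               # n = 7

total_21 = sum(m * sn_degree(s) for s, m in check_21.items())
total_31 = sum(m * sn_degree(s) for s, m in check_31.items())
print(total_21, gl_dimension((2, 1), 5))
print(total_31, gl_dimension((3, 1), 6))
```

Agreement of the two numbers on each line is only a necessary condition for the expansion to be correct, which is how the text uses it.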
















REFERENCES 


1. H. O. Foulkes, Plethysms of S-functions, Phil. Trans. Roy. Soc. (A), 246 (1954), 555-591.
2. D. E. Littlewood, On invariant theory under restricted groups, Phil. Trans. Roy. Soc. (A), 239 (1944), 387-417.
3. ———, Invariant theory under orthogonal groups, Proc. London Math. Soc. (2), 50 (1948), 349-379.
4. ———, The theory of group characters and matrix representations of groups (2nd edn., Oxford, 1950).
5. ———, The Kronecker product of symmetric group representations, J. London Math. Soc., 31 (1956), 89-93.
6. ———, The inner plethysm of S-functions, Can. J. Math., 10 (1958), 1-16.
7. F. D. Murnaghan, The analysis of the Kronecker product of irreducible representations of $S_n$, Amer. J. Math., 60 (1938), 761-784.
8. ———, Theory of group representations (Baltimore, 1949).
9. ———, The analysis of the Kronecker product of irreducible representations of $S_n$, Proc. Nat. Acad. Sci., 41 (1955), 515-518.


University College of North Wales
Bangor





A SPECIAL FORMULA FOR THE LIE 
CHARACTER 


ROBERT L. DAVIS 


In 1942 Thrall (3) introduced what he called the Lie character to study the
structure of the free Lie ring. These characters are defined by associative
representations of the full linear group $GL_q(C)$. Their importance is seen
in the fact that, for each $n$, the splitting of the submodule of Lie forms of
degree $n$ into irreducible invariant subspaces is completely specified by the
Lie character $\ell_n$.

Thrall's original determination of the Lie character depended on a very
difficult recursive procedure which is hardly feasible for $n > 10$. Two years
later, however, Brandt (1) used his results to get an explicit expression for the
general Lie character in terms of characters of the symmetric group. But even
using Brandt's formula would be impractical without tables of characters for
the symmetric group; thus for $n \ge 15$ it is of mainly theoretical interest.

However, there is an important special case where we can so transform
Brandt's formula as to permit computation for very much higher $n$: namely,
the case in which the number of generators of the Lie ring is 2. This note
derives such an improved formula and several of its consequences.


Received March 14, 1957. These remarks are the substance of Section 4, Chapter II of the
author's doctoral dissertation written at the University of Michigan under the direction of
Professor R. M. Thrall.


1. Preliminaries. Let $R$ be the free associative formal power-series ring
over the complex numbers in non-commutative indeterminates $x_1, \ldots, x_q$.
If we introduce within $R$ a non-associative operation given by $[f, g] = fg - gf$
we can single out a subspace $L$ of $R$ which is isomorphic to the free Lie ring
on $q$ generators. $L$ is defined recursively as the $C$-space generated by the $x_i$'s
and all those "bracket products" $[f, g]$ for which $f$ and $g$ are themselves in $L$.

Let $L_n$ and $R_n$ be respectively the submodules of Lie forms of degree $n$ and
of all homogeneous polynomials of degree $n$. Then $L$ and $R$ can be written
as direct sums:


$$L = L_1 + L_2 + \ldots, \qquad R = R_1 + R_2 + \ldots.$$


Next let $f_i = f_i(x_1, \ldots, x_q)$ $(i = 1, \ldots, \psi_n)$ be a basis for $L_n$, $\psi_n$ being the
dimension of $L_n$ as given by Witt (5). If $A : x_i \to \sum_j a_{ij} x_j$ is any element of
$GL_q(C)$ then $A$ defines an automorphism of $L$ in which each $L_n$ is mapped into
itself. Let $\mathfrak{L}(A)$ be the matrix, with respect to the basis of $f_i$'s, of the
transformation so induced in $L_n$. Then the mapping $\mathfrak{L} : A \to \mathfrak{L}(A)$ is a
representation of $GL_q(C)$ which Thrall called the Lie representation. Now all the
irreducible representations of $GL_q$ within $R_n$ are well known (4); there is one of
these, say $R_n^{\lambda}$, corresponding to each partition $(\lambda)$ of $n$. Hence the Lie
representation must have the form

(1) $\mathfrak{L} = \sum_{(\lambda)} c_\lambda R_n^{\lambda}$

where the summation is over all partitions of $n$. To determine $\mathfrak{L}$ it is only
necessary to find the multiplicities $c_\lambda$. But to find these we may as well work
with the Lie character


(2) $\ell_n = \operatorname{Tr} \mathfrak{L} = \sum_{(\lambda)} c_\lambda \{\lambda\}.$


Here $\{\lambda\}$ is the S-function (Schur function) defined by the partition $(\lambda)$.
Starting from Thrall's recursive procedure for computing $\ell_n$, Brandt was
able to derive the explicit formula


(3) $\ell_n = \dfrac{1}{n} \sum_{d \mid n} \mu(d)\, s_d^{\,n/d},$


where $\mu$ is the Möbius function and $s_d$ Newton's sum function: $s_d(A) = \operatorname{Tr} A^d$
(which is $\epsilon_1^d + \ldots + \epsilon_q^d$ if $A$ has eigenvalues $\epsilon_1, \ldots, \epsilon_q$). Since the
expressions $s_d^{\,n/d}$ are linear combinations of S-functions whose coefficients are
characters of the symmetric group $S_n$, this formula permits computation of
$\ell_n$ for $n \le 14$ (those values of $n$ for which there are tables of characters of
$S_n$). To go higher one would have to compute sizable parts of the appropriate
table of characters before beginning.
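
One consequence of formula (3) needs no character tables at all. Evaluated at the identity matrix, every Newton sum $s_d$ equals $q$, so (3) reduces to the dimension $\frac{1}{n}\sum_{d \mid n} \mu(d)\, q^{n/d}$ of $L_n$, which is Witt's formula. The sketch below illustrates that specialization only; the function names are hypothetical and it does not compute the character itself.

```python
def mobius(n):
    """Moebius function: 0 if n has a squared prime factor, else (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def lie_dimension(n, q):
    """Value of formula (3) at the identity of GL_q: the dimension of L_n."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


dimensions = [lie_dimension(n, 2) for n in range(1, 9)]
print(dimensions)
```

For two generators the first values give a quick consistency target for any expansion of $\ell_n$ in S-functions.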


2. Restriction to two generators. Nothing has yet been said about the
relative sizes of $q$, the number of generators, and $n$, the "power" of this $n$th
power representation. The chief force of the restriction to 2 generators comes
in the fact that whenever $q < n$ those irreducible representations which correspond
to partitions of $n$ into more than $q$ parts do not appear in the $n$th power
tensor representation, and hence not in $\mathfrak{L}$ either. Another advantage is that
when $q = 2$ we can readily derive a simple expression for the symmetric group
characters that occur as coefficients in the expression for $s_d^{\,n/d}$ in terms of
S-functions.

To go from Brandt's formula to one in S-functions is to change basis in the
vector space of all homogeneous symmetric polynomials of degree $n$. One basis
is that of the S-functions, $\{\lambda\}$. Brandt's formula uses a basis of products of
Newton sums. If we write


$$S(\mu) = s_{\mu_1} s_{\mu_2} \cdots s_{\mu_n}$$


for the product of the Newton sums


$$s_{\mu_i} = \sum_j \epsilon_j^{\mu_i}$$


(with $s_0$ set equal to 1) then the $S(\mu)$'s provide another basis for this same
space of homogeneous functions, as $(\mu)$ runs over all partitions of $n$. The
change of basis is given by


(4) $S(\mu) = s_{\mu_1} \cdots s_{\mu_n} = \sum_{(\lambda)} \chi^{\lambda}_{\mu} \{\lambda\},$

where $\chi^{\lambda}_{\mu}$ is the value of the symmetric-group character $\chi^{\lambda}$ for any argument
with cycles of lengths $\mu_1$, $\mu_2$, and so forth.

The only $S(\mu)$'s in Brandt's formula are those of the form $s_d \cdots s_d$ ($n/d$
factors) for each divisor $d$ of $n$. Therefore, in view of the fact that the $\{\lambda\}$'s are
a basis we can substitute from (4) in (3) to get


$$\ell_n = \sum_{(\lambda)} c_\lambda \{\lambda\} = \frac{1}{n} \sum_{d \mid n} \mu(d) \Big( \sum_{(\lambda)} \chi^{\lambda}_{d^{n/d}} \{\lambda\} \Big),$$

so that


(5) $c_\lambda = \dfrac{1}{n} \sum_{d \mid n} \mu(d)\, \chi^{\lambda}_{d^{n/d}},$


the sum being taken over values of $\chi^{\lambda}$ for any permutation with $n/d$ $d$-cycles.

Since the one-part partition gives tensors which map into zero in $L$, the
only $(\lambda)$'s in our formula when $q = 2$ are the two-part partitions
$(n - k, k)$, $1 \le k \le [\tfrac{1}{2}n]$. If we now write $c_{nk}$ for the coefficient $c_\lambda$, all these
reductions of the problem can be summarized in


LEMMA 1. When $q = 2$ the Lie character is given by


(6) $\ell_n = \sum_{k=1}^{[n/2]} c_{nk} \{n - k, k\},$


where the coefficient $c_{nk}$ is


(7) $c_{nk} = \dfrac{1}{n} \sum_{d \mid n} \mu(d)\, \chi^{n-k,k}_{d^{n/d}}.$


These character values fall out at once from a consequence of Frobenius's
formula. To state this handily it helps to change partition notation and denote
the class of a typical permutation by $(\alpha) = (1^{\alpha_1} \ldots s^{\alpha_s})$, where $s$ is now the
largest part in the partition and $\alpha_k$ is the number of $k$-cycles in any member of
the class $(\alpha)$. The value of a character $\chi^{n-k,k}$ for such a permutation is then
(2, p. 143)


(8) $\chi^{n-k,k}_{(\alpha)} = \sum_{(\beta)} \binom{\alpha_1}{\beta_1} \cdots \binom{\alpha_k}{\beta_k} - \sum_{(\gamma)} \binom{\alpha_1}{\gamma_1} \cdots \binom{\alpha_{k-1}}{\gamma_{k-1}},$


the sums being taken over all partitions $(\beta)$ and $(\gamma)$ of $k$ and $k - 1$
respectively: that is, over all solutions in non-negative integers of the
equations


$$\sum_{t=1}^{k} t\beta_t = k \quad \text{and} \quad \sum_{t=1}^{k-1} t\gamma_t = k - 1.$$
















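LEMMA 2. The character values are given by


$$\chi^{n-k,k}_{d^{n/d}} = \left\{ \begin{array}{lll}
\binom{n}{k} - \binom{n}{k-1} & \text{if } d = 1, & (a) \\[4pt]
\binom{n/d}{k/d} & \text{if } d \mid k, & (b) \\[4pt]
-\binom{n/d}{(k-1)/d} & \text{if } d \mid k - 1, & (c) \\[4pt]
0 & \text{if } d \nmid k,\ d \nmid k - 1. & (d)
\end{array} \right.$$

In the present "$\alpha$-notation" we are writing $(d, \ldots, d)$ as $(1^0 2^0 \ldots d^{n/d})$ so
that in this case $\alpha_d = n/d$ and the other $\alpha_i = 0$. The result follows at once
from formula (8).

Since $\alpha_d = n/d$ and all other $\alpha$'s are zero, the product


$$\binom{\alpha_1}{\beta_1} \cdots \binom{\alpha_k}{\beta_k}$$


can be non-zero only if $(\beta)$ is a product of $d$-cycles. This requires that $d \mid k$, and
then the value of the product is
$$\binom{n/d}{k/d}.$$


All four results are now immediate.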
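
Lemma 2 can be checked by brute force for small $n$. The sketch below rests on one assumption not spelled out in the text: each sum in formula (8) counts the subsets of the given size that a permutation of class $(\alpha)$ maps onto themselves, so the character value is a difference of two fixed-subset counts. The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def fixed_subsets(n, d, k):
    """Number of k-subsets of {0,...,n-1} fixed by a permutation with n/d cycles of length d."""
    # The permutation rotates each consecutive block of d points.
    perm = [(i // d) * d + (i % d + 1) % d for i in range(n)]
    return sum(1 for s in combinations(range(n), k)
               if {perm[i] for i in s} == set(s))


def chi_brute(n, k, d):
    """Character value read off from (8) as a difference of fixed-subset counts."""
    return fixed_subsets(n, d, k) - fixed_subsets(n, d, k - 1)


def chi_lemma2(n, k, d):
    """Character value according to cases (a)-(d) of Lemma 2."""
    if d == 1:
        return comb(n, k) - comb(n, k - 1)
    if k % d == 0:
        return comb(n // d, k // d)
    if (k - 1) % d == 0:
        return -comb(n // d, (k - 1) // d)
    return 0


agree = all(chi_brute(n, k, d) == chi_lemma2(n, k, d)
            for n in range(2, 10)
            for d in range(1, n + 1) if n % d == 0
            for k in range(1, n // 2 + 1))
print(agree)
```

The enumeration is exponential in $n$ and is meant only as a sanity check; the closed form of the lemma is what makes large $n$ tractable.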
COROLLARY. For every $n > 2$, $c_{n1} = 1$.


The only divisor of $k = 1$ is 1 and (a) says


$$\chi^{n-1,1}_{1^n} = \binom{n}{1} - \binom{n}{0} = n - 1.$$


On the other hand any $d$ divides $k - 1 = 0$ so we must take separate account
of every $d > 1$. For each of these, by (c), we have


$$\chi^{n-1,1}_{d^{n/d}} = -\binom{n/d}{0/d} = -1.$$


Then, using Lemma 1,


$$c_{n1} = \frac{1}{n} \sum_{d \mid n} \mu(d)\, \chi^{n-1,1}_{d^{n/d}}
= \frac{1}{n} \Big( \mu(1)(n - 1) + \sum_{d \mid n,\ d \ne 1} \mu(d)(-1) \Big).$$

But taking only the $d$'s greater than 1 gives $\sum \mu(d) = -1$, so


$$c_{n1} = \frac{1}{n} \big( n - 1 + (-1)(-1) \big) = 1.$$


This fact is clear, too, from direct examination of the Lie ring.
















With Lemma 2 it is straightforward to compute $\ell_n$ at least within the
considerable range of binomial tables. To do so one would first partition the set
of divisors $d > 1$ of $n$ into five sets:

$D_0$: those which do not divide either $k$ or $k - 1$ together with any other
$d$'s which are not square-free (for which we would have $\mu(d) = 0$);

$D_1$: divisors of $k$ which are products of an even number of distinct primes
(so $\mu(d) = 1$);

$D_2$: divisors of $k$ which are products of an odd number of distinct primes
(so $\mu(d) = -1$);

$D_3$: divisors of $k - 1$ with $\mu(d) = 1$, and

$D_4$: divisors of $k - 1$ with $\mu(d) = -1$.


In these terms we can summarize all the results so far as a theorem. To
reduce parentheses we give the value of $n c_{nk}$ rather than $c_{nk}$ itself.


THEOREM. For any $n > 2$ and any $k$ with $1 \le k \le [\tfrac{1}{2}n]$, $n c_{nk}$ is


$$\binom{n}{k} - \binom{n}{k-1} + \sum_{D_1} \binom{n/d}{k/d} - \sum_{D_2} \binom{n/d}{k/d} - \sum_{D_3} \binom{n/d}{(k-1)/d} + \sum_{D_4} \binom{n/d}{(k-1)/d}.$$


COROLLARY 1. For $n > 3$, $c_{n2} = \left[ \dfrac{n-3}{2} \right]$.


For if $n$ is odd the sets $D_1, \ldots, D_4$ are all empty so that
$$n c_{n2} = \binom{n}{2} - \binom{n}{1} = n\,\frac{n-3}{2} = n \left[ \frac{n-3}{2} \right].$$
If $n$ is even then $D_2 = \{2\}$ is not empty and


$$n c_{n2} = \binom{n}{2} - \binom{n}{1} - \binom{n/2}{1} = n\,\frac{n-4}{2} = n \left[ \frac{n-3}{2} \right].$$
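
The Theorem lends itself to direct computation. The following sketch evaluates $c_{nk}$ from formula (7) using the character values of Lemma 2, which is equivalent to sorting the divisors into the sets $D_0, \ldots, D_4$. As an independent check it uses two standard facts assumed here rather than taken from the text: the representation $\{n-k, k\}$ of the linear group in two variables has dimension $n - 2k + 1$, and the dimension of $L_n$ is given by Witt's formula. All function names are hypothetical.

```python
from math import comb


def mobius(n):
    """Moebius function."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def chi(n, k, d):
    """Character value of Lemma 2 on permutations with n/d cycles of length d."""
    if d == 1:
        return comb(n, k) - comb(n, k - 1)
    if k % d == 0:
        return comb(n // d, k // d)
    if (k - 1) % d == 0:
        return -comb(n // d, (k - 1) // d)
    return 0


def c(n, k):
    """Multiplicity c_nk of {n-k, k} in the Lie character for two generators."""
    total = sum(mobius(d) * chi(n, k, d) for d in range(1, n + 1) if n % d == 0)
    return total // n


def dimension_from_multiplicities(n):
    # {n-k, k} has dimension n - 2k + 1 in two variables (assumed standard fact).
    return sum(c(n, k) * (n - 2 * k + 1) for k in range(1, n // 2 + 1))


def witt_dimension(n, q=2):
    return sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0) // n


print([c(8, k) for k in range(1, 5)])
print(dimension_from_multiplicities(8), witt_dimension(8))
```

For $n = 8$ the multiplicities are small enough to verify by hand from the Theorem, and the same functions run without difficulty for $n$ in the hundreds.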


One can likewise write simplified formulas giving $c_{nk}$ for any fixed $k > 2$,
for any $n$. But in no such case are the congruence properties so simple as for
$k = 2$, nor is there such a handy notation to summarize them as that of the
greatest integer function. Thus for $k = 3$ one must distinguish three cases:
$n \equiv 0 \pmod 6$, $n \equiv 3 \pmod 6$, and $n$ even but not divisible by 6. On the other
hand, by fixing $n$ as a prime and allowing $k$ to vary we can at once state


COROLLARY 2. When $n = p$ is prime, then for all $k$ with $1 < k \le [\tfrac{1}{2}p]$,


$$p\, c_{pk} = \binom{p}{k} - \binom{p}{k-1}.$$











REFERENCES


1. A. Brandt, The free Lie ring and Lie representations of the full linear group, Trans. Amer. Math. Soc., 56 (1944), 528-536.
2. F. D. Murnaghan, The Theory of Group Representations (Baltimore, 1938).
3. R. M. Thrall, On symmetrized Kronecker powers and the structure of the free Lie ring, Amer. J. Math., 64 (1942), 371-388.
4. Hermann Weyl, The Classical Groups (Princeton, 1946).
5. E. Witt, Treue Darstellung Liescher Ringe, J. reine angew. Math., 177 (1937), 152-160.


University of Virginia








THE REPRESENTATION TYPE OF ALGEBRAS 
AND SUBALGEBRAS 


J. P. JANS 


1. Introduction. For $A$ an associative algebra with identity over a field
$K$, $[A : K] < \infty$, and $d$ an integer, we define $g_A(d)$ to be the number of
inequivalent indecomposable $A$-modules of degree $d$ over $K$. Following (6),
we define $A$ to be of finite representation type if


$$\sum_d g_A(d) < \infty.$$
$A$ is said to be of bounded representation type if there exists $d_0$ such that
$g_A(d) = 0$ for $d > d_0$; $A$ is of unbounded representation type if not of bounded
type. We shall say that $A$ is of strongly unbounded type if $g_A(d) = \infty$ for an
infinite number of integers $d$. See (6) and (7) for a number of conditions
showing algebras to be of strongly unbounded type.

Now let $A$ be a subalgebra of an algebra $\Gamma$ with $[\Gamma : K] < \infty$ also. In this
paper we give conditions under which the representation type of $A$ can be
related to that of $\Gamma$.

To do this, we must have a process for inducing $\Gamma$-modules from $A$-modules
and conversely. Such processes date back to Frobenius (in the case of group
representations), have been studied extensively by D. G. Higman (4), and
are used by Cartan and Eilenberg (2, II, §6) under the heading "change of
rings." We shall consider conditions on the algebra $\Gamma$ and the subalgebra $A$
under which every indecomposable $\Gamma$-module is obtained from an indecomposable
$A$-module or conversely. In this way we may relate their representation
types.

It should be noted that, unlike (2) and (4), we do not require that the
identity of $A$ also be the identity of $\Gamma$. This allows consideration of a wider
class of subalgebras. Also, for $M$ to be an $A$-module we do not require that the
identity $1 \in A$ act like the identity on $M$. By an indecomposable $A$-module $M$,
however, we mean indecomposable and non-trivial (that is, $1M \ne (0)$). With
these two assumptions, an indecomposable $A$-module $M$ does have the property
that $1m = m$ for all $m \in M$, for otherwise a trivial direct summand could be
split off.


2. Algebras and subalgebras. Let $M$ be a two-sided (associative) $A$-module.
It is convenient to regard $M$ as a left $A^e$-module, where $A^e = A \otimes_K A'$ and $A'$
is anti-isomorphic to $A$ (2, IX, §3). Thus if $A$ is a subalgebra of $\Gamma$ then $\Gamma$ is a
left $A^e$-module and $A$ is an $A^e$-submodule of $\Gamma$.


Received September 21, 1956. 


THEOREM 1. If $A$ is a subalgebra of $\Gamma$ such that $\Gamma = A + C$ ($A^e$-direct sum),
and if $A$ is of (strongly) unbounded type then so is $\Gamma$.


Proof. Let $M$ be an indecomposable $A$-module; then $\Gamma \otimes_A M$ is a $\Gamma$-module
(called the induced module $I(M)$ in (4) or the covariant $\varphi$-extension of $M$ in
(2, II, §6)) which in turn can be considered as an $A$-module. By the assumptions
on $A$ and $\Gamma$,
$$\Gamma \otimes_A M = (A + C) \otimes_A M = A \otimes_A M + C \otimes_A M = M + C \otimes_A M$$
($A$-direct sum). For the last equality we need the fact that $A$ has an identity
which acts as an identity on $M$. This follows from the indecomposability of
$M$.

We then have that $\Gamma \otimes_A M$ contains an indecomposable $\Gamma$-direct summand
$P(M)$ such that $[M : K][\Gamma : K] \ge [P(M) : K] \ge [M : K]$. To exhibit $P(M)$,
first decompose $\Gamma \otimes_A M$ into indecomposable $\Gamma$-direct summands and then
decompose each of those into indecomposable $A$-direct summands. By the
Krull-Schmidt theorem (5, V, §13) one of these $A$-direct summands is $M$.
$P(M)$ is the $\Gamma$-direct summand which contains $M$.

Thus if $A$ is of unbounded type so is $\Gamma$. If $A$ has an infinite number of
indecomposable modules, $(M_i)$, $i \in I$, all of degree $d$ over the field $K$, then each
of the $\Gamma$-indecomposable modules $P(M_i)$ can be isomorphic to no more than
a finite number of $P(M_j)$. Thus if $A$ is of strongly unbounded representation
type so is $\Gamma$.

In the proof of Theorem 1, we could have used the module $\operatorname{Hom}_A(\Gamma, M)$
(called the produced module in (4), the contravariant $\varphi$-extension of $M$ in
(2, II, §6)) instead of $\Gamma \otimes_A M$. Under the assumptions of Theorem 1,
$\operatorname{Hom}_A(\Gamma, M) = \operatorname{Hom}_A(A, M) + \operatorname{Hom}_A(C, M)$ ($A$-direct). If $1M = M$ (for
example, $M$ indecomposable) then $\operatorname{Hom}_A(A, M) = M$. Again using the
Krull-Schmidt theorem, there exists a $\Gamma$-indecomposable module $P'(M)$ with the
same properties as $P(M)$.

There are several occasions when a subalgebra $A$ of $\Gamma$ will satisfy the
hypotheses of Theorem 1. For instance, if $H$ is a subgroup of a finite group $G$, then
the subgroup algebra $A$ is an $A^e$-direct summand of the group algebra $\Gamma$. The
$A^e$-complement of $A$ in $\Gamma$ has a $K$-basis consisting of group elements not in the
subgroup. Theorem 1 for group algebras and subgroup algebras was proved by
Higman (3).

Another important case where the conditions of Theorem 1 are satisfied is
when $e$ is an idempotent of $\Gamma$ and $A$ is taken to be $e\Gamma e$. Here $\Gamma = A + C$ (Peirce
decomposition) is $A^e$-direct. This case was pointed out to us by Higman and is
contained in the following:


COROLLARY 2. If $e$ is an idempotent of $\Gamma$ and $e\Gamma e$ is of (strongly) unbounded
type then so is $\Gamma$.


A restricted version of Corollary 2 is used in (6) and (7) in the case that
$e\Gamma e$ is the basic algebra of $\Gamma$.

Higman also noted that the condition $\Gamma = A + C$ ($A^e$-direct sum) of
Theorem 1 is equivalent to the condition $e\Gamma e = A + eCe$ ($A^e$-direct sum)
where $e$ is the identity of $A$. When these conditions hold and $A$ is of (strongly)
unbounded type then both $e\Gamma e$ and $\Gamma$ are also of (strongly) unbounded type
by Theorem 1.

Theorem 1 is also applicable to the tensor product of two algebras. 


COROLLARY 3. If $\Gamma = A \otimes_K \Sigma$, and either $A$ or $\Sigma$ is of (strongly) unbounded
type then so is $\Gamma$.


Proof. We need only note that $A \otimes_K 1 \cong A$ is a subalgebra of $\Gamma$ and
$$\Gamma = \sum_i A \otimes_K a_i$$
($A^e$-direct sum) where $(a_i)$ is a $K$-basis for $\Sigma$ and $a_1 = 1$ in $\Sigma$.

Now suppose that the subalgebra is of finite (bounded) type. We consider 
conditions under which the containing algebra is also of finite (bounded) type. 
The following theorem gives one such condition. 


THEOREM 4. If $A$ has an identity $e$ and $\Gamma e \otimes_A e\Gamma = \Gamma + C$ ($\Gamma^e$-direct sum)
then if $A$ is of finite (bounded) type so is $\Gamma$.


Proof. Let $M$ be an indecomposable $\Gamma$-module, $\Gamma M = M$. Consider $M$ as an
$A$-module and form the $\Gamma$-module $\Gamma e \otimes_A M$. Then
$$\Gamma e \otimes_A M = \Gamma e \otimes_A eM = \Gamma e \otimes_A (e\Gamma \otimes_\Gamma M) = (\Gamma e \otimes_A e\Gamma) \otimes_\Gamma M = (\Gamma + C) \otimes_\Gamma M = M + C \otimes_\Gamma M,$$
all as $\Gamma$-modules. The first equality results from the following: $\Gamma e \otimes_A M_0 = (0)$
where $M_0$ is the $A$-submodule of $M$ annihilated by $e$ and $\Gamma e \otimes_A M =
\Gamma e \otimes_A eM + \Gamma e \otimes_A M_0$. The other equalities follow from distributivity and
associativity of the tensor product and from the assumptions of the theorem.

Thus, under the assumptions of the theorem, we have shown that every
indecomposable $\Gamma$-module $M$ appears as a direct summand of a module of the
form $\Gamma e \otimes_A eM$, where $eM$ is an $A$-module (on which $e$ acts as identity).
Further, if $eM = eM' + eM''$ ($A$-direct sum) then $\Gamma e \otimes_A eM = \Gamma e \otimes_A eM' +
\Gamma e \otimes_A eM''$ ($\Gamma$-direct sum). Hence every indecomposable $\Gamma$-module appears as
a direct summand of some $\Gamma e \otimes_A M$ where $M$ is an indecomposable $A$-module.
Note also that $[\Gamma e \otimes_A M : K] \le [\Gamma e : K][M : K]$. Thus if $A$ is of finite
(bounded) type so is $\Gamma$.

The above proof could have been altered to use the module
$\operatorname{Hom}_A(\Gamma e, eM)$ instead of $\Gamma e \otimes_A eM$.


The conditions of Theorem 4 are difficult to apply because the structure of
the $\Gamma^e$-module $\Gamma e \otimes_A e\Gamma$ is complicated. We do, however, make use of this
condition in the following:


THEOREM 5. If $\Sigma$ is a separable algebra over the field $K$, then $A$ and $A \otimes_K \Sigma$
are of the same representation type.


Proof. By Corollary 3, if $A$ is of (strongly) unbounded type so is $A \otimes_K \Sigma$,
regardless of $\Sigma$.

If $A$ is of finite (bounded) type, we apply the condition of Theorem 4. Note
that here the identity of $A \otimes 1 \cong A$ is also the identity of $A \otimes_K \Sigma$. From
(2, IX, 7.10), a necessary and sufficient condition that $\Sigma$ be separable is that
the exact sequence

$$(0) \to C \to \Sigma \otimes_K \Sigma' \xrightarrow{\ \rho\ } \Sigma \to (0)$$


split, where $\rho(\alpha \otimes \beta') = \alpha\beta$. This means that the $\Sigma^e$-module $\Sigma \otimes_K \Sigma$ (which
is $\Sigma^e$-isomorphic to $\Sigma \otimes_K \Sigma'$) can be written as a $\Sigma^e$-direct sum $\Sigma + C$. Now
tensor-multiply over $K$ the $A^e$-module $A = A \otimes_A A$ with $\Sigma \otimes_K \Sigma = \Sigma + C$
to obtain the $A^e \otimes_K \Sigma^e$-modules


$$(A \otimes_A A) \otimes_K (\Sigma \otimes_K \Sigma) = A \otimes_K \Sigma + A \otimes_K C.$$


But $A^e \otimes_K \Sigma^e = \Gamma^e$, the left hand side of the above equation is $\Gamma^e$-isomorphic
with $(A \otimes_K \Sigma) \otimes_A (A \otimes_K \Sigma)$ or $\Gamma \otimes_A \Gamma$, and the right hand side is
$\Gamma^e$-isomorphic with $\Gamma + A \otimes_K C$. The condition of Theorem 4 is satisfied and $\Gamma$ is
of finite (bounded) type.


3. Fields. In the following we consider the representation type of the
tensor product of two fields. Algebras obtained in this way are commutative.
In (6), it is shown that for $A$ a commutative $K$-algebra, $[A : K] < \infty$, where
$N$ is the radical of $A$ and $A/N$ is a direct sum of fields isomorphic to $K$, $A$ is of
finite type if and only if $A$ is the direct sum of ideals $A_i$ where $A_i \cong K[X]/(X^{n_i})$,
$X$ an indeterminate over $K$. If $K$ is infinite and $A$ is not of finite type then $A$
is of strongly unbounded type.

In the following, the degree of any containing field over the base field is 
always finite. 


LEMMA 1. If $J$ is a field purely inseparable over the field $K$ and $J$ cannot be
obtained from $K$ by the adjunction of a single element then $J \otimes_K J$ is of strongly
unbounded type. If $J = K(\alpha)$ then $J \otimes_K J$ is of finite type.


Proof. In either case we have the mapping $\rho : J \otimes_K J \to J$, $\rho(\alpha \otimes \beta) = \alpha\beta$,
which gives rise to the exact sequence


$$0 \to N \to J \otimes_K J \to J \to 0.$$
The ideal $N$, generated by elements of the form $n = \alpha \otimes 1 - 1 \otimes \alpha$, is the
radical of $J \otimes_K J$ because
$$n^{p^a} = 0, \qquad [J : K] = p^a,$$
$p$ the characteristic of $K$. Thus $J \otimes_K J$, considered as a $J$-algebra, is of the
form considered in (6).

If $J$ is not obtained by a single adjunction then $K$ and $J$ are infinite (1,
Theorem 26) and for each $\alpha$ in $J$ there exists $b < a$ such that $\alpha^{p^b}$ belongs to
$K$. Let $m$ be the largest of these $b$'s, $m$ strictly $< a$. Thus $[N : J] = p^a - 1$
and for every $n$ in $N$,
$$n^{p^m} = 0.$$
Hence $J \otimes_K J$ cannot be isomorphic to
$$J[X]/(X^{p^a})$$
so it is of strongly unbounded type. $J \otimes_K J$ is also of strongly unbounded type
when considered as a $K$-algebra.

In the case $J = K(\alpha)$, the radical element $\alpha \otimes 1 - 1 \otimes \alpha = n$, non-zero
powers of which are linearly independent, generates $N$. For the powers $\alpha^r$ ($r =
1, \ldots, p^a$) form a $K$-basis for $J$ and the elements $\alpha^i \otimes \alpha^j$ ($i, j = 1, \ldots, p^a$)
form a $K$-basis for $J \otimes_K J$. But


$$n^{p^a - 1} = (\alpha \otimes 1 - 1 \otimes \alpha)^{p^a - 1} = \alpha^{p^a - 1} \otimes 1 + * + 1 \otimes \alpha^{p^a - 1},$$


where $*$ indicates a sum of terms of the form $f\alpha^i \otimes \alpha^j$ with both $i$ and $j$ greater
than 1. This is not zero so in this case


$$J \otimes_K J \cong J[X]/(X^{p^a})$$


and considered as a $J$-algebra is of finite type. Both $K$ and $J$ are subalgebras
in the centre of $J \otimes_K J$ and any $J$-linear transformation is also $K$-linear so
$J \otimes_K J$ considered as a $K$-algebra is also of finite type.

Using Lemma 1 and the structure theory for fields, we may drop the purely 
inseparable condition. 


LEMMA 2. If $F$ can be obtained from $K$ by a single adjunction then $F \otimes_K F$ is
of finite type. If not, $F \otimes_K F$ is of strongly unbounded type.


Proof. Let $J$ be the field of elements of $F$ purely inseparable over $K$; then $F$
is separable over $J$, $J$ purely inseparable over $K$. $F$ can be obtained by a single
algebraic adjunction if and only if $J$ can be so obtained. This is seen by using
(1, Theorem 26) and the fact that every field between $F$ and $K$ is the unique
composite of a field separable over $K$ and a field purely inseparable over $K$
(that is, between $J$ and $K$).

Hence by Lemma 1 $J \otimes_K J$ is of strongly unbounded type if $F$ is not
obtainable by a single adjunction, of finite type if $F$ can be so obtained. But by
Theorem 5, $J \otimes_K J$, $(J \otimes_K J) \otimes_J F = J \otimes_K F$, and $F \otimes_J (J \otimes_K F) =
F \otimes_K F$ are all of the same representation type because $F$ is separable over $J$.

Combining Lemmas 1 and 2, we prove the following: 


THEOREM 6. Let $\Omega \supset L, F \supset K$, $L \cap F = J$ all be fields. If $L$, $F$ are separable
over $J$ and $J$ is obtained from $K$ by a single adjunction then $L \otimes_K F$ is of
finite type. If $J$ is not obtained from $K$ by a single adjunction then $L \otimes_K F$ is of
strongly unbounded type.


Proof. In the first case $J \otimes_K J$ is of finite type by Lemma 2. Using Theorem
5, the algebras $J \otimes_K J$, $(J \otimes_K J) \otimes_J F = J \otimes_K F$ and $L \otimes_J (J \otimes_K F) =
L \otimes_K F$ are all of the same (that is, finite) type because $L$ and $F$ are assumed
separable over $J$.

In the second case $J \otimes_K J$ is of strongly unbounded type by Lemma 2; thus
so are the algebras $J \otimes_K F$ and $L \otimes_K F$ by Corollary 3.

We believe that Theorem 6 could be sharpened to read "$L \otimes_K F$ and
$J \otimes_K J$ are of the same representation type"; however, in the case $J \otimes_K J$ is of
finite type, the condition of Theorem 5 is too weak to be used without additional
assumptions on $L$ and $F$.


REFERENCES 


1. E. Artin, Galois Theory, Notre Dame Mathematical Lectures, 2 (1944).
2. H. Cartan and S. Eilenberg, Homological Algebra (Princeton, 1956).
3. D. G. Higman, Indecomposable representations at characteristic p, Duke Math. J., 21 (1954), 377-381.
4. ———, Induced and produced modules, Can. J. Math., 7 (1955), 490-508.
5. N. Jacobson, Lectures in Abstract Algebra (New York, 1951).
6. J. P. Jans, On the indecomposable representations of algebras (forthcoming in the Annals of Mathematics).
7. T. Yoshii, Note on algebras of strongly unbounded representation type, Proc. Japan Acad., 32 (1956), 383-387.





Ohio State University 
and 
University of Washington 













GOURSAT’S THEOREM AND THE 
ZASSENHAUS LEMMA 


JOACHIM LAMBEK 


1. Introduction. In this paper we study generalized homomorphisms
between two algebras, namely the binary relations whose graphs are
subalgebras of the direct product of the given algebras.¹ In 1897 Goursat proved
that every subgroup of the direct product of two groups is determined by an
isomorphism between factor groups of subgroups of the given groups (10, §§11, 12;
25, pp. 15, 16). A like result is here shown for a general class of algebras,
including loops and quasigroups, by a method due to Riguet (22). This result
is used to obtain general forms of the Zassenhaus lemma and the
Jordan-Hölder-Schreier theorem for normal series (26, §9). It is also shown how
Goldie's generalization (8) of the latter may be derived by these methods.

For easier reading, all results are first proved for groups in §2. Although the 
results for groups are not new, except proposition 2, the proofs given here 
carry over without change to the class of algebras considered in §3. It is 
difficult to judge from the extensive literature whether the J.H.S. theorem for 
normal series has previously been extended to quasigroups in the present form, 
because most authors on loops and quasigroups (for example, 1, 20) do not 
count division among the operations; the first to do so was apparently Evans 
(7). The extension of these results to systems with partial and infinitary 
operations is discussed in §4. 

To introduce our notation, we briefly review some concepts from the calculus
of binary relations. A binary relation between two sets $A$ and $B$ is a triple
$\rho = (R, A, B)$, where $R$ is a subset of the Cartesian product $A \times B$, called the
graph of $\rho$. One usually writes $a\rho b$ to mean $(a, b) \in R$. Relations of special
interest are the identity relation $\iota_A$ on $A$, the converse $\rho^{-1} = (R^{-1}, B, A)$ of $\rho$ and
the relative product $\rho\sigma = (RS, A, C)$ of $\rho$ and $\sigma = (S, B, C)$. These are defined by


1.1 $a\,\iota_A\,a' \Leftrightarrow a = a' \in A$,
1.2 $b\,\rho^{-1}a \Leftrightarrow a\rho b$,
1.3 $a\,\rho\sigma\,c \Leftrightarrow a\rho b$ and $b\sigma c$ for some $b \in B$.


We write $\rho \le \rho' = (R', A, B)$ if $R$ is a subset of $R'$.
Received September 28, 1956; in revised form June 15, 1957. This paper was written while
the author held a summer research associateship of the Canadian National Research Council.
¹It was pointed out to me by G. D. Findlay that ordinary homomorphisms have this
property.

One may think of the relation $\rho = (R, A, B)$ as a many-valued mapping of
part of $A$ into $B$ and say that

1.4 $\rho$ is universally defined $\Leftrightarrow \iota_A \le \rho\rho^{-1}$,
1.5 $\rho$ is onto $\Leftrightarrow \iota_B \le \rho^{-1}\rho$,
1.6 $\rho$ is faithful $\Leftrightarrow \rho\rho^{-1} \le \iota_A$,
1.7 $\rho$ is single-valued $\Leftrightarrow \rho^{-1}\rho \le \iota_B$.

If $\kappa = (K, A, A)$, one says that

1.8 $\kappa$ is symmetric $\Leftrightarrow \kappa^{-1} \le \kappa$,
1.9 $\kappa$ is transitive $\Leftrightarrow \kappa\kappa \le \kappa$,
1.10 $\kappa$ is reflexive $\Leftrightarrow \iota_A \le \kappa$.

An equivalence relation satisfies all of these three; but relations which are
merely symmetric and transitive are also of interest. Any such relation satisfies


$$\kappa^{-1} = \kappa, \qquad \kappa\kappa = \kappa$$
and has the same graph as an equivalence relation on a subset of $A$. The
following definition is due to Riguet (22).

1.11 $\rho$ is difunctional $\Leftrightarrow \rho\rho^{-1}\rho \le \rho$.


This is easily seen to imply $\rho\rho^{-1}\rho = \rho$ and means that whenever $a\rho b'$, $a'\rho b'$ and
$a'\rho b$ then $a\rho b$. Riguet showed that such a $\rho$ determines a one-to-one
correspondence between equivalence classes on subsets of $A$ and $B$ respectively.
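
The calculus of relations used here is easy to experiment with on finite sets. The sketch below is an illustration under the assumption that a relation is represented simply by its graph, a Python set of pairs; the function names are hypothetical. It implements the converse, the relative product of 1.3, and the test 1.11 for difunctionality.

```python
def converse(r):
    """Graph of the converse relation: b is related to a whenever a is related to b."""
    return {(b, a) for (a, b) in r}


def compose(r, s):
    """Graph of the relative product: a is related to c when a r b and b s c for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def is_difunctional(r):
    """Condition 1.11: the product of r, its converse, and r is contained in r."""
    return compose(compose(r, converse(r)), r) <= r


# 1 and 2 share the image 'x', and 2 is also related to 'y', but 1 is not.
rho = {(1, 'x'), (2, 'x'), (2, 'y'), (3, 'z')}
# Adding the missing pair makes {1, 2} and {'x', 'y'} corresponding classes.
rho_fixed = rho | {(1, 'y')}

print(is_difunctional(rho), is_difunctional(rho_fixed))
```

In the second relation the classes $\{1, 2\}$, $\{3\}$ on the left correspond one-to-one with $\{x, y\}$, $\{z\}$ on the right, which is the correspondence Riguet's result describes.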
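We shall write


$$a\rho = \{b \mid a\rho b\};$$
more generally, for any subset $A'$ of $A$,
$$A'\rho = \{b \mid a\rho b \text{ for some } a \in A'\}.$$


In particular, $A\rho$ is the range of $\rho$, $B\rho^{-1}$ is its domain. The following rules are
well known and will be used freely:
$$(\rho\sigma)\tau = \rho(\sigma\tau),$$
$$\rho\iota_B = \rho = \iota_A\rho,$$
$$\iota_A^{-1} = \iota_A, \qquad (\rho^{-1})^{-1} = \rho, \qquad (\rho\sigma)^{-1} = \sigma^{-1}\rho^{-1},$$
$$A'(\rho\sigma) = (A'\rho)\sigma.$$


We often take advantage of the first and last of these to write without brackets
$\rho\sigma\tau$ and $A'\rho\sigma$.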


If $\kappa = (K,A,A)$ is any binary relation such that $\kappa \le \kappa\kappa = \kappa^2$, then $\kappa^n \le \kappa^{n+1}$
for all $n \ge 1$. The union $\kappa^* = (K^*,A,A)$ of all $\kappa^n$ is called the transitive closure
of $\kappa$, since $\kappa \to \kappa^*$ is a closure operation and $\kappa = \kappa^*$ if and only if $\kappa$ is transitive.

One easily verifies that

1.12   $\kappa^*\kappa^* = \kappa^*\kappa = \kappa^*, \quad (\kappa^{-1})^* = (\kappa^*)^{-1}$.

For any binary relation $\rho = (R,A,B)$, Riguet (23) defines its difunctional closure

1.13   $\rho^+ = (\rho\rho^{-1})^*\rho = \rho(\rho^{-1}\rho)^*$.

$\rho \to \rho^+$ is again a closure operation and $\rho = \rho^+$ if and only if $\rho$ is difunctional.
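The two closure operations can be tried out on a small finite relation. The following is only a sketch under the assumption that relations are finite sets of pairs; the function names and the sample relation are hypothetical and do not come from the paper.

```python
def compose(r, s):
    """Relative product of two relations given as sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    return {(b, a) for (a, b) in r}

def transitive_closure(k):
    """Union of the powers k, k^2, k^3, ... of a finite relation k."""
    closure = set(k)
    while True:
        bigger = closure | compose(closure, k)
        if bigger == closure:
            return closure
        closure = bigger

def is_difunctional(r):
    """Condition 1.11: r r^-1 r is contained in r."""
    return compose(compose(r, converse(r)), r) <= r

def difunctional_closure(r):
    """The closure of 1.13, computed as (r r^-1)^* r."""
    return compose(transitive_closure(compose(r, converse(r))), r)

# Hypothetical example: 0 and 1 share "x", 1 and 2 share "y", but (0, "y") is missing.
rho = {(0, "x"), (1, "x"), (1, "y"), (2, "y")}
rho_plus = difunctional_closure(rho)
```

In this example `rho` is not difunctional, and its closure is the full product of `{0, 1, 2}` with `{"x", "y"}`; the second expression in 1.13 gives the same set.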










2. Homomorphic relations between groups. To generalize the notion
of a homomorphism of a group $A$ into a group $B$, we call the binary relation
$\rho = (R,A,B)$ homomorphic if and only if

(i) $1\rho 1$,

(ii) if $a\rho b$ then $a^{-1}\rho b^{-1}$,

(iii) if $a\rho b$ and $a'\rho b'$ then $aa'\rho bb'$.

Clearly then, $\rho$ is homomorphic if and only if its graph $R$ is a subgroup of the
direct product $A \times B$. It is easily verified that the identity relation, the converse
of a homomorphic relation and the relative product of two homomorphic
relations are all homomorphic. One also verifies for any homomorphic
$\rho = (R,A,B)$ that if $A'$ is a subgroup of $A$ then $A'\rho$ is a subgroup of $B$.

A homomorphic equivalence relation is usually called a congruence relation.
We shall call subcongruence any homomorphic relation which is transitive and
symmetric without necessarily being reflexive. If $\kappa = (K,A,A)$ is such a
subcongruence on $A$, it induces a congruence relation $(K,A\kappa,A\kappa)$ on its range $A\kappa$.
The factor group of $A\kappa$ modulo $\kappa$ is usually written $A\kappa/\kappa$; we shall call it a
subfactor of $A$. We define $\bar\kappa = (\bar K,A,A\kappa/\kappa)$ by

2.1    $a\,\bar\kappa\,(a'\kappa) \iff a\kappa a'$,

so that $a\bar\kappa = a\kappa$. A simple calculation shows that

2.2    $\bar\kappa\bar\kappa^{-1} = \kappa, \quad \bar\kappa^{-1}\bar\kappa = \iota_{A\kappa/\kappa}$,

whence

2.3    $\bar\kappa^{-1}\kappa\bar\kappa = \iota_{A\kappa/\kappa}$.

Note that $\bar\kappa$ induces the well-known natural homomorphism $(\bar K,A\kappa,A\kappa/\kappa)$.
PROPOSITION 1 (Riguet). If $\rho = (R,A,B)$ is a difunctional homomorphic
relation (between two groups) then
(i) $\kappa = \rho\rho^{-1}$ is a subcongruence of $A$ with range $B\rho^{-1}$,
(ii) $\lambda = \rho^{-1}\rho$ is a subcongruence of $B$ with range $A\rho$,
(iii) $\rho$ induces an isomorphism $\mu$ between subfactors $A\kappa/\kappa$ and $B\lambda/\lambda$ such
that
$(a\kappa)\,\mu\,(b\lambda)$ if and only if $a\rho b$.

Conversely, every isomorphism between subfactors of $A$ and $B$ is induced in this
way.
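A small finite instance may make the statement concrete. The sketch below is an illustration only: it takes the cyclic groups $Z_4$ and $Z_6$ (written additively), builds a subgroup of $Z_4 \times Z_6$ from two arbitrarily chosen generators, and reads off the two subcongruences and the induced correspondence between their classes. The helper names and the choice of generators are assumptions of this example.

```python
def compose(r, s):
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    return {(b, a) for (a, b) in r}

def generate_subgroup(gens, m, n):
    """Subgroup of Z_m x Z_n generated by gens, under componentwise addition."""
    elems = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:
        a, b = frontier.pop()
        for ga, gb in gens:
            nxt = ((a + ga) % m, (b + gb) % n)
            if nxt not in elems:
                elems.add(nxt)
                frontier.append(nxt)
    return elems

def block(x, rel):
    """The class {y | x rel y}, as a hashable frozenset."""
    return frozenset(y for (x2, y) in rel if x2 == x)

rho = generate_subgroup([(1, 3), (0, 2)], 4, 6)   # graph of a homomorphic relation
kappa = compose(rho, converse(rho))               # subcongruence on Z_4
lam = compose(converse(rho), rho)                 # subcongruence on Z_6
mu = {(block(a, kappa), block(b, lam)) for (a, b) in rho}

for left, right in sorted(mu, key=lambda pair: sorted(pair[0])):
    print(sorted(left), "<->", sorted(right))
# [0, 2] <-> [0, 2, 4]
# [1, 3] <-> [1, 3, 5]
```

In this instance the graph consists of the 12 pairs whose components have the same parity, and the induced map pairs the two parity classes of $Z_4$ with those of $Z_6$.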
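Proof. In view of 1.11 we have

$(\rho\rho^{-1})(\rho\rho^{-1}) = (\rho\rho^{-1}\rho)\rho^{-1} = \rho\rho^{-1}$,

and anyway

$(\rho\rho^{-1})^{-1} = (\rho^{-1})^{-1}\rho^{-1} = \rho\rho^{-1}$,

hence $\rho\rho^{-1}$ is a subcongruence by 1.9 and 1.8. Moreover

$A(\rho\rho^{-1}) = (A\rho)\rho^{-1} \subseteq B\rho^{-1}$,

and $b\rho^{-1}$ is empty unless $b \in A\rho$, hence

2.4    $A\rho\rho^{-1} = B\rho^{-1}$.

This establishes (i), and by symmetry (ii). Let $\bar\kappa = (\bar K,A,A\kappa/\kappa)$ and
$\bar\lambda = (\bar L,B,B\lambda/\lambda)$ be defined by 2.1 and put

2.5    $\mu = \bar\kappa^{-1}\rho\bar\lambda$,

then

2.6    $\mu\mu^{-1} = \iota_{A\kappa/\kappa}, \quad \mu^{-1}\mu = \iota_{B\lambda/\lambda}$;

for

$\mu^{-1}\mu = \bar\lambda^{-1}\rho^{-1}\bar\kappa\bar\kappa^{-1}\rho\bar\lambda = \bar\lambda^{-1}\rho^{-1}\kappa\rho\bar\lambda$
$\qquad = \bar\lambda^{-1}\rho^{-1}\rho\rho^{-1}\rho\bar\lambda = \bar\lambda^{-1}\rho^{-1}\rho\bar\lambda = \bar\lambda^{-1}\lambda\bar\lambda = \iota_{B\lambda/\lambda}$,

by 2.5, 2.2, (i), 1.11, (ii) and 2.3. Hence $\mu$ is an isomorphism between $A\kappa/\kappa$
and $B\lambda/\lambda$, by 1.4 to 1.7. A further calculation shows that

2.7    $\bar\kappa\mu\bar\lambda^{-1} = \rho$.

This gives a "canonical decomposition" (18) of $\rho$ and is paraphrased by (iii).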
Conversely, let $\mu$ be a given isomorphism between given subfactors $A\kappa/\kappa$ and
$B\lambda/\lambda$ of $A$ and $B$ respectively. If $\rho$ is defined by 2.7, a computation will show
that

$\rho\rho^{-1} = \kappa, \quad \rho^{-1}\rho = \lambda, \quad \rho\rho^{-1}\rho = \rho$.

This completes the proof of Proposition 1, which may also be written more
concisely thus:

PROPOSITION 1'. Subfactors $A\kappa/\kappa$ and $B\lambda/\lambda$ of (groups) $A$ and $B$ respectively
are isomorphic if and only if there exists a difunctional homomorphic relation
$\rho = (R,A,B)$ such that $\rho\rho^{-1} = \kappa$ and $\rho^{-1}\rho = \lambda$.


The importance of the above derives from the following: 


PROPOSITION 2. Any homomorphic relation $\rho = (R,A,B)$ (between two
groups) is difunctional.

Proof. Write $f_3(x,y,z) = xy^{-1}z$, then

(†)    $f_3(x,y,y) = x, \quad f_3(y,y,z) = z$.

Now let $a, a' \in A$ and $b, b' \in B$ and assume

$a\rho b', \quad a'\rho b', \quad a'\rho b$;

then

$a = f_3(a,a',a') \;\rho\; f_3(b',b',b) = b$.

Thus $\rho$ is difunctional by 1.11.
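Proposition 2 can be checked by brute force in a small case. The sketch below is an illustration under stated assumptions, not the author's method: it enumerates every subgroup of $Z_4 \times Z_6$ generated by at most two elements and tests condition 1.11 on each, then contrasts this with a relation whose graph is not a subgroup. All names are hypothetical.

```python
from itertools import product

def compose(r, s):
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    return {(b, a) for (a, b) in r}

def is_difunctional(r):
    """Condition 1.11: r r^-1 r is contained in r."""
    return compose(compose(r, converse(r)), r) <= r

def generate_subgroup(gens, m, n):
    """Subgroup of Z_m x Z_n generated by gens, under componentwise addition."""
    elems = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:
        a, b = frontier.pop()
        for ga, gb in gens:
            nxt = ((a + ga) % m, (b + gb) % n)
            if nxt not in elems:
                elems.add(nxt)
                frontier.append(nxt)
    return elems

M, N = 4, 6
points = list(product(range(M), range(N)))

# Every subgroup generated by a pair of elements passes the test.
all_difunctional = all(
    is_difunctional(generate_subgroup([g, h], M, N))
    for g in points
    for h in points
)

# A graph that is not a subgroup need not be difunctional:
# 0 r 0, 1 r 0 and 1 r 1 hold, but 0 r 1 does not.
not_a_subgroup = {(0, 0), (1, 0), (1, 1)}
counterexample_fails = not is_difunctional(not_a_subgroup)
```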


Propositions 1 and 2 together give Goursat's characterization of the
subgroups of the direct product of two groups, since all such subgroups are graphs
of homomorphic relations between the groups. We have seen that Proposition 1
yields a characterization of all isomorphisms between subfactors of
groups. Thus the isomorphism of Zassenhaus (26, p. 54) may be obtained as
follows.


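PROPOSITION 3. If $\kappa$ and $\lambda$ are subcongruences of (a group) $A$, then $\kappa\lambda$ induces
an isomorphism between the subfactors of $A$ modulo $\kappa\lambda\kappa$ and $\lambda\kappa\lambda$.

Proof. By Proposition 2, $\kappa\lambda$ is difunctional, hence by Proposition 1 we
have subcongruences

$(\kappa\lambda)(\kappa\lambda)^{-1} = \kappa\lambda\lambda\kappa = \kappa\lambda\kappa$

and

$(\kappa\lambda)^{-1}(\kappa\lambda) = \lambda\kappa\kappa\lambda = \lambda\kappa\lambda$,

whose associated subfactors are isomorphic.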

Zassenhaus used this result in a somewhat different form to prove the J.H.S. 
theorem for normal series. For the purpose of generalization, it is of interest 
to see how this can be done using Proposition 3 in the present form. 

If $C$ is a subgroup of $A$ and $\kappa$ is a subcongruence of $A$ such that $C \subseteq C\kappa$,
we shall call $\kappa$ a subcongruence of $A$ over $C$. Following Goldie (9), we call
normal series from $A'$ to $C'$ any $m$-tuple $\kappa_1, \ldots, \kappa_m$ of subcongruences of $A$ over
$C$ such that

2.8    $A' = A\kappa_1, \quad C\kappa_1 = A\kappa_2, \quad \ldots, \quad C\kappa_{m-1} = A\kappa_m, \quad C\kappa_m = C'$.

Ultimately we are only interested in the case $A' = A$, $C' = C$; the more
comprehensive definition is useful in the following.


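PROPOSITION 4. If $\lambda$ is any subcongruence of (a group) $A$ over (a subgroup)
$C$ then any normal series of subcongruences of $A$ over $C$ from $A'$ to $C'$ gives rise to
a normal series from $A'\lambda$ to $C'\lambda$.

Proof. If $\rho = (R,A,B)$ is difunctional and $A_0$ and $B_0$ are subgroups of
$A$ and $B$ respectively such that $B_0 \subseteq A_0\rho$ and $A_0 \subseteq B_0\rho^{-1}$ then

2.9    $A_0\rho\rho^{-1} = B_0\rho^{-1}, \quad B_0\rho^{-1}\rho = A_0\rho$,

since for instance

$A_0\rho \subseteq B_0\rho^{-1}\rho \subseteq A_0\rho\rho^{-1}\rho = A_0\rho$

by 1.11. If $\kappa$ and $\lambda$ are both subcongruences of $A$ over $C$, it follows from 2.9,
by taking $B = A$, $B_0 = A_0 = C$ and $\rho = \kappa\lambda$ or $\lambda\kappa$ that

2.10   $C\kappa\lambda\kappa = C\lambda\kappa, \quad C\lambda\kappa\lambda = C\kappa\lambda$.

It follows similarly from 2.4 with $B = A$ that

2.11   $A\kappa\lambda\kappa = A\lambda\kappa, \quad A\lambda\kappa\lambda = A\kappa\lambda$.

Using 2.8, 2.10 and 2.11 we compute

$A'\lambda = A\kappa_1\lambda = A\lambda\kappa_1\lambda$,
$C\lambda\kappa_i\lambda = C\kappa_i\lambda = A\kappa_{i+1}\lambda = A\lambda\kappa_{i+1}\lambda \quad (i = 1, \ldots, m-1)$,
$C\lambda\kappa_m\lambda = C\kappa_m\lambda = C'\lambda$,

thus establishing the analogue of 2.8 which completes the proof.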


In view of Propositions 3 and 4, we may now state the
Jordan-Hölder-Schreier-Zassenhaus theorem in the following form:

PROPOSITION 5. If $\kappa_1, \ldots, \kappa_m$ and $\lambda_1, \ldots, \lambda_n$ are normal series from $A$ to $C$
then the rectangular array $\{\kappa_i\lambda_j\kappa_i\}_{i \le m,\, j \le n}$ may be ordered by rows to give a
"refinement" of the former, the array $\{\lambda_j\kappa_i\lambda_j\}_{i \le m,\, j \le n}$ may be ordered by columns to give
a refinement of the latter, and corresponding entries of the two arrays determine
isomorphic subfactors.


3. Generalization to other algebraic systems. By an $n$-ary operation
$f_n$ on a set $A$ is understood a mapping which assigns to each $n$-tuple of
elements of $A$ a single element of $A$, $n$ being some finite non-negative integer.
In particular, a 0-ary operation is a constant. We thus consider operations
which are finitary, universally defined and single-valued.

Let $F$ be a set of operation symbols with prescribed subscripts. An algebra,
in the sense of Birkhoff (3), is a representation of such a set of symbols as
$n$-ary operations on a set $A$, and may be denoted by ${}_FA$. If $A'$ is a subset of $A$
closed under all the operations in $F$, the induced representation ${}_FA'$ is called a
subalgebra of $A$. Two algebras ${}_FA$ and ${}_GB$ are called similar if $F = G$; if no
confusion is likely to arise, they may be denoted merely by $A$ and $B$. The
Cartesian product $A \times B$ of two similar algebras is turned into another
algebra of the same kind, called the direct product of $A$ and $B$, by the familiar
device, here illustrated by a unary operation,

$f_1(a, b) = (f_1a, f_1b) \quad (a \in A,\ b \in B)$.

We define homomorphic relations between $A$ and $B$ as binary relations whose
graphs are subalgebras of $A \times B$. In particular, a homomorphism satisfies 1.4
and 1.7, an isomorphism also 1.5 and 1.6; a subcongruence satisfies 1.8 and 1.9,
a congruence also 1.10. Factors and subfactors are defined as for groups.

Compound operations on an algebra are obtained by composition from the
given operations and the selection operations $I_{nk}$:

$I_{nk}(x_1, \ldots, x_n) = x_k \quad (1 \le k \le n)$.

Mal'cev has introduced the notion of a primitive class of algebras (19). This
is a maximal class of similar algebras subject to a given set of postulates
expressed as identities between compound operations. For instance, the primitive
class of groups consists of all sets with three operations

$f_0 = 1, \quad f_1x = x^{-1}, \quad f_2(x, y) = x \cdot y$,

subject to the postulates

$f_2(f_2(x,y),z) = f_2(x,f_2(y,z))$,
$f_2(f_0,x) = x = f_2(x,f_0)$,
$f_2(x,f_1x) = f_0 = f_2(f_1x,x)$.


The proofs of Propositions 1 to 5 were purposely stated in such a way that
they remain valid, with minor modification of terminology, when the primitive
class of groups is replaced by other primitive classes of algebras, as follows.

THEOREM I. Proposition 1 holds for any primitive class of algebras, Propositions
2 to 5 hold for any primitive class with a compound operation $f_3$ satisfying

(†)    $f_3(x,y,y) = x, \quad f_3(y,y,z) = z$.

This result applies to loops and generalized loops, as seen from the first two
of the following examples.


Example 1. Smiley (24) has considered what might be called left-loops, with
operations $1$, $\cdot$, $/$ subject to

$(x/y) \cdot y = x, \quad (x \cdot y)/y = x, \quad 1 \cdot y = y$.

Taking $f_3(x,y,z) = (x/y) \cdot z$, we easily verify (†) (see Proposition 2). Every
left-loop $A$ has a one-element subalgebra $A_0 = \{1\}$. It is easily verified that
every subcongruence $\kappa$ of $A$ is uniquely determined by its range $A\kappa$ and its
kernel $A_0\kappa$. In fact, $a\kappa a'$ if and only if $a, a' \in A\kappa$ and $a/a' \in A_0\kappa$, that is
$a \in (A_0\kappa) \cdot a'$. The subfactor $A\kappa/\kappa$ may then be denoted by $A\kappa/A_0\kappa$, to conform
with the usual notation in group theory. If $\rho = (R,A,B)$ is any homomorphic
relation between left-loops, the subcongruence $\rho\rho^{-1}$ gives rise to the subfactor

$\dfrac{A\rho\rho^{-1}}{A_0\rho\rho^{-1}} = \dfrac{B\rho^{-1}}{B_0\rho^{-1}}$,

in view of 2.4 and 2.9. Hence, by Theorem I, any homomorphic relation
$\rho = (R,A,B)$ between left-loops induces an isomorphism between

$\dfrac{B\rho^{-1}}{B_0\rho^{-1}}$ and $\dfrac{A\rho}{A_0\rho}$.

The Zassenhaus lemma is easily seen to take the familiar form: If $U/U'$ and $V/V'$
are subfactors of a left-loop then

$\dfrac{U' \cdot (U \cap V)}{U' \cdot (U \cap V')} \cong \dfrac{V' \cdot (V \cap U)}{V' \cdot (V \cap U')}$.
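
Example 2. A quasigroup is a system with operations $\cdot$, $/$, $\backslash$ subject to

$(x/y) \cdot y = x, \quad (x \cdot y)/y = x, \quad y \cdot (y \backslash z) = z, \quad y \backslash (y \cdot x) = x$.

Taking $f_3(x, y, z) = (x \cdot (y \backslash y))/(z \backslash y)$, we easily verify (†).²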
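The identities (†) for this compound operation can be checked numerically on a small quasigroup. The sketch below is only an illustration: the multiplication $x \cdot y = x + 2y \pmod 5$ is an arbitrary choice made here (its table is a Latin square, and it is not associative), and the names `mul`, `rdiv`, `ldiv` and `f3` are hypothetical.

```python
N = 5

def mul(x, y):
    """Hypothetical quasigroup product on {0,...,4}: x . y = x + 2y (mod 5)."""
    return (x + 2 * y) % N

def rdiv(x, y):
    """x / y: the unique w with w . y == x."""
    return next(w for w in range(N) if mul(w, y) == x)

def ldiv(y, z):
    """y \\ z: the unique w with y . w == z."""
    return next(w for w in range(N) if mul(y, w) == z)

def f3(x, y, z):
    """The compound operation of Example 2: (x . (y \\ y)) / (z \\ y)."""
    return rdiv(mul(x, ldiv(y, y)), ldiv(z, y))

elements = range(N)
identities_hold = all(
    f3(x, y, y) == x and f3(y, y, z) == z
    for x in elements
    for y in elements
    for z in elements
)

# The product is not associative, so this is not a group in disguise.
is_associative = all(
    mul(mul(x, y), z) == mul(x, mul(y, z))
    for x in elements
    for y in elements
    for z in elements
)
```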


Example 3. A relatively complemented lattice (2, p. 105) is a lattice in which
to every element $a \cup b$ between $a$ and $a \cup b \cup c$ there corresponds a so-called
relative complement $g_3(a, b, c)$ such that

$(a \cup b) \cup g_3(a,b,c) = a \cup b \cup c$,
$(a \cup b) \cap g_3(a,b,c) = a$.

If we reckon $g_3$ among the operations of a relatively complemented lattice, we
may put $f_3(a, b, c) = g_3(a, b, c) \cap g_3(c, b, a)$ and verify (†) by computation.³

Birkhoff (2, p. 89) has proved the J.H.S.Z. theorem for principal series for
algebras whose congruence relations permute, that is, where $\kappa\lambda = \lambda\kappa$ for any
two congruence relations $\kappa$ and $\lambda$. It is therefore of interest that the following
three statements about a primitive class of algebras are equivalent:

M1. There is a compound operation $f_3$ satisfying (†).

M2. All homomorphic relations are difunctional.

M3. All pairs of congruence relations on the same algebra permute.

In fact, M1 implies M2 by Theorem I, M2 implies M3, because

$\kappa\lambda = \iota_A\kappa\lambda\iota_A \le \lambda\kappa\lambda\kappa = (\lambda\kappa)(\lambda\kappa)^{-1}(\lambda\kappa) = \lambda\kappa$,

by 1.10 and 1.11, and symmetrically $\lambda\kappa \le \kappa\lambda$. Finally, Mal'cev has shown that
M3 implies M1, by an ingenious argument involving the free algebra with
three generators in the primitive class.

Goldie (8) has generalized the J.H.S.Z. theorem for normal series to a larger
class of algebras satisfying a condition of "weak associability," which is
equivalent to our 2.10. We briefly indicate how Goldie's result can be deduced from
Proposition 1.

If $A$ is an algebra with finitary operations then the union of any increasing
sequence of subalgebras is also a subalgebra, as is well known.⁴ In particular,
if $\kappa = (K, A, A)$ is a homomorphic relation such that $\kappa \le \kappa\kappa = \kappa^2$, then its
transitive closure $\kappa^*$ is also homomorphic. If $A$ and $B$ are similar algebras with
finitary operations, it follows that the difunctional closure $\rho^+$ of a homomorphic
relation $\rho$, defined by 1.13, is also homomorphic. Writing $\rho^+$ for $\rho$ in Proposition
1, we find that any homomorphic $\rho$ gives rise to two subcongruences


²Essentially this formula appears in the paper by Mal'cev (19).

³This construction of $f_3$ is implicit in the proof by Dilworth (4) that congruence relations
on a relatively complemented lattice permute.

⁴See for instance the proof by Kurosh (15, p. 48) for groups.







3.1    $\rho^+(\rho^+)^{-1} = (\rho\rho^{-1})^*, \quad (\rho^+)^{-1}\rho^+ = (\rho^{-1}\rho)^*$,

whose associated subfactors are isomorphic. (Equation 3.1 follows from 1.13
and 1.12.) In place of Proposition 3, we find that, given subcongruences $\kappa$ and $\lambda$,
their relative product $\kappa\lambda$ induces an isomorphism between the subfactors
belonging to the subcongruences $(\kappa\lambda\kappa)^*$ and $(\lambda\kappa\lambda)^*$. Modified Propositions 4
and 5 for normal series from $A$ to $C$ can then be deduced from 2.10, which is
now postulated, with little more trouble than for groups. We may ask what
class of algebras satisfies 2.10.


PROPOSITION 6. Let $A$ be an algebra with a compound operation $f_3$ and a
subalgebra $C$ such that

$f_3(a,c,c) = a, \quad f_3(a,a,c) \in C$,

for all $a \in A$, $c \in C$. If $\kappa = (K, A, A)$ is any homomorphic relation such that
$C \subseteq C\kappa$ then

3.2    $C\kappa\kappa^{-1} = C\kappa^{-1}$.

Proof. Let $a \in C\kappa\kappa^{-1}$, then $c\kappa a'$ and $a\kappa a'$ for some $c \in C$ and $a' \in A$. Now
$c\kappa c'$ for some $c' \in C$, since $C \subseteq C\kappa$. Hence

$a = f_3(a,c,c) \;\kappa\; f_3(a',a',c') = c'' \in C$,

so that $C\kappa\kappa^{-1} \subseteq C\kappa^{-1}$. The converse follows immediately from $C \subseteq C\kappa$. Note
that 2.10 can be derived from 3.2 as from 2.9.


Example 4. Let us call left-quasigroup any system $A$ with two binary
operations $\cdot$ and $/$ satisfying

$(x/y) \cdot y = x, \quad (x \cdot y)/y = x$.

Let $f_3(x, y, z) = (x/y) \cdot z$, then $f_3(x, y, y) = x$. Given $a \in A$, call $a/a$ a left-unit,
and let $C$ be the subalgebra generated by the left-units, then for any $c \in C$
we have $f_3(a, a, c) = (a/a) \cdot c \in C$.⁵

Example 5. Consider a system $A$ with $1$ and $/$ satisfying

$x/x = 1, \quad x/1 = x$.

Let $C = \{1\}$, $f_3(x, y, z) = (x/y)/z$ or merely $x/y$, and verify that

$f_3(a,a,1) = 1, \quad f_3(a,1,1) = a$.


4. Discussion of further generalizations. In §3 we demanded that
the operations of an algebra be finitary, universally defined and single-valued.
We shall briefly discuss what happens when one or more of these restrictions
are relaxed.

⁵Murdoch (20) has carried out a somewhat similar construction for certain quasigroups,
without counting division among the operations.

By an infinitary operation on $A$ is usually understood a mapping which
assigns a value in $A$ to any sequence of elements in $A$, a sequence being a
mapping of the set $I$ of natural numbers into $A$. Looking back at the proofs
of Propositions 1 to 5, we find that no use has been made of the fact that the
operations in a group are finitary, except in the definition of compound
operations and therefore in the concept of primitive class. Now the only property
of a primitive class used here is the possibility of making certain constructions
without going outside the class. We thus have


THEOREM II. Theorem I remains valid for a class of similar algebras with some 
infinitary operations, provided the class contains all subalgebras, factors and direct 
products of algebras in the class. 


It seems that Goldie’s generalization cannot be thus extended, as long as it 
depends on the fact, not in general valid for algebras with infinitary operations, 
that the union of an increasing sequence of subalgebras is also a subalgebra. 


Example 6. A Boolean $\sigma$-algebra (2, p. 167) is a Boolean algebra with an
infinitary operation "sup" such that for any sequence $\mathbf{a}$ of elements $1\mathbf{a}, 2\mathbf{a}, \ldots$
and any element $a$, $\sup \mathbf{a} \le a$ if and only if $i\mathbf{a} \le a$ for all $i \in I$. If $+$ denotes
the so-called symmetric difference, we write $f_3(x,y,z) = x + y + z$ and
verify (†). This example may be generalized to relatively complemented
$\sigma$-lattices, in view of Example 3.
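The finitary part of this example is easy to check by machine. As a hedged illustration only, the sketch below models the Boolean algebra of subsets of a four-element set by the integers 0 to 15, with bitwise XOR standing in for symmetric difference; it says nothing about the infinitary operation "sup". The names are assumptions of the sketch.

```python
from itertools import product

def f3(x, y, z):
    """x + y + z, where + is symmetric difference (bitwise XOR on bit-sets)."""
    return x ^ y ^ z

subsets = range(16)   # bit masks for the subsets of a 4-element set

identities_hold = all(
    f3(x, y, y) == x and f3(y, y, z) == z
    for x, y, z in product(subsets, repeat=3)
)
```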

The familiar limit operation of analysis is a partial operation, in the sense 
that it is defined only for some sequences, which are called convergent. We shall 
use it here to illustrate arbitrary partial operations, be they finitary or infini- 
tary, availing ourselves of the ready-made terminology that goes with the 
limit concept. 

We consider an algebra $A$ with a partial infinitary operation "lim." No
connection is assumed between the algebraic structure of $A$ and its
"topological" structure. Interest centres on subalgebras of $A$ which are closed, that
is, closed under "lim," and on homomorphic relations with closed graphs. But
even if $\rho$ is an ordinary homomorphism of $A$ into $B$ with closed graph, $B'$ a
closed subalgebra of $B$, it is not in general true that $B'\rho^{-1}$ is closed, unless $\rho$ is
continuous. Moreover, in extending results of group theory to groups with a
limit operation, one aims to make sure that all relevant isomorphisms are
bicontinuous. Of interest are therefore classes of similar algebras with the
so-called closed-graph property: Every isomorphism between two members of the class
is continuous if it has a closed graph.


Example 7. Given any sequence $\mathbf{a}$ whose set of terms is $I\mathbf{a} = \{1\mathbf{a}, 2\mathbf{a}, \ldots\}$,
a subsequence has the form $\sigma\mathbf{a}$, where $\sigma$ is a mapping of $I$ into $I$. Kuratowski
(14, p. 84) imposes the following postulates on the operation "lim."

K1. If $\lim \mathbf{a} = a$ then $\lim \sigma\mathbf{a} = a$.

K2. If $i\mathbf{a} = a$ for all $i \in I$ then $\lim \mathbf{a} = a$.

K3. If every subsequence $\sigma\mathbf{a}$ of $\mathbf{a}$ has a subsequence $\tau\sigma\mathbf{a}$ such that $\lim \tau\sigma\mathbf{a} = a$
then $\lim \mathbf{a} = a$.

We make no use of K2, instead we demand compactness:

K4. Every sequence has a convergent subsequence.

It is now easily shown that if $A$ and $B$ are sets with a limit operation satisfying
K1, K3 and K4, then any one-to-one correspondence between $A$ and $B$ is
continuous if it has a closed graph.

Theorem II can be extended to such systems as loops and quasigroups with
a limit operation satisfying K1, K3 and K4. We refrain from carrying out this
extension here, since the study of infinite sequences should really be replaced
by that of nets (11, II). It is known that the class of locally compact
quasigroups with countable base enjoys the closed-graph property (19, theorem
12). This suggests further extension of our results.

In view of the extensive literature on multigroups and related systems (for
example, 6; 13), we should say a word about them. A many-valued operation
assigns non-empty subsets of $A$ to $n$-tuples of elements of $A$. The following
example may give a clue to the extension of the present results to systems with
many-valued operations.


Example 8. Kuntzmann (13) introduces a multiform system with three
many-valued operations $\cdot$, $/$, $\backslash$ subject to

$c \in a \cdot b \iff a \in c/b, \quad c \in a \cdot b \iff b \in a \backslash c$.

Writing $f_3(x,y,z) = (x \cdot (y \backslash y))/(z \backslash y)$, in the sense of operations on complexes,
we easily verify that

$x \in f_3(x, y, y), \quad z \in f_3(y, y, z)$.








REFERENCES

1. G. E. Bates and F. Kiokemeister, A note on homomorphic mappings of quasigroups etc., Bull. Amer. Math. Soc., 54 (1948), 1180-1185.
2. G. Birkhoff, Lattice Theory (rev. ed., New York, 1948).
3. G. Birkhoff, Universal algebra, Proc. 1st Can. Math. Congr., (1949), 310-326.
4. R. P. Dilworth, The structure of relatively complemented lattices, Ann. Math., 51 (1950), 348-359.
5. P. Dubreil et M.-L. Dubreil-Jacotin, Théorie algébrique des relations d'équivalence, J. de Math., 104 (1939), 63-95.
6. M. Dresher and O. Ore, Theory of multigroups, Amer. J. Math., 60 (1938), 705-753.
7. T. Evans, Homomorphisms of non-associative systems, J. Lond. Math. Soc., 24 (1949), 254-260.
8. A. W. Goldie, The Jordan-Hölder-Schreier theorem for general abstract algebras, Proc. London Math. Soc., 52 (1950), 107-131.
9. A. W. Goldie, The scope of the Jordan-Hölder-Schreier theorem in abstract algebra, Proc. London Math. Soc. (3), 2 (1952), 349-368.
10. E. Goursat, Sur les substitutions orthogonales etc., Ann. sci. éc. norm. sup. (3), 6 (1889), 9-102.
11. J. L. Kelley, General Topology (New York, 1955).
12. F. Kiokemeister, A theory of normality for quasigroups, Amer. J. Math., 70 (1948), 99-106.
13. J. Kuntzmann, Contributions à l'étude des systèmes multiformes, Ann. Fac. Sci. Univ. Toulouse (4), 3 (1939), 155-193.
14. C. Kuratowski, Topologie I (Warszawa, 1952).
15. A. G. Kurosh, The Theory of Groups I (New York, 1955).
16. J. Lambek, Groups and herds, abstract 145t, Bull. Amer. Math. Soc., 61 (1955), 58.
17. P. Lorenzen, Ueber die Korrespondenzen einer Struktur, Math. Zeitschr., 60 (1954), 61-65.
18. S. MacLane, Duality for groups, Bull. Amer. Math. Soc., 56 (1950), 485-516.
19. A. I. Mal'cev, On the general theory of algebraic systems, Mat. Sbornik N.S., 35 (1954), 3-20.
20. D. C. Murdoch, Quasigroups which satisfy certain generalized associative laws, Amer. J. Math., 61 (1939), 509-522.
21. L. Pontrjagin, Topological Groups (Princeton, 1946).
22. J. Riguet, Relations binaires etc., Bull. Soc. Math. France, 76 (1948), 114-155.
23. J. Riguet, Quelques propriétés des relations difonctionnelles, C.R. Acad. Sci. Paris, 230 (1950), 1999-2000.
24. M. F. Smiley, Notes on left divisor systems with left units, Amer. J. Math., 74 (1952), 679-682.
25. W. Threlfall and H. Seifert, Bewegungsgruppen des dreidimensionalen sphärischen Raumes, Math. Ann., 104 (1931), 1-70.
26. H. Zassenhaus, The Theory of Groups (New York, 1949).











THE TERM RANK OF A MATRIX 
H. J. RYSER 


1. Introduction. This paper continues a study appearing in (5) of the
combinatorial properties of a matrix $A$ of $m$ rows and $n$ columns, all of whose
entries are 0's and 1's. Let the sum of row $i$ of $A$ be denoted by $r_i$ and let the
sum of column $i$ of $A$ be denoted by $s_i$. We call $R = (r_1, \ldots, r_m)$ the row sum
vector and $S = (s_1, \ldots, s_n)$ the column sum vector of $A$. The vectors $R$ and $S$
determine a class $\mathfrak{A}$ consisting of all (0, 1)-matrices of $m$ rows and $n$ columns,
with row sum vector $R$ and column sum vector $S$. Simple arithmetic properties
of $R$ and $S$ are necessary and sufficient for the existence of a class $\mathfrak{A}$ (1; 5).

Let $\delta_i = (1, \ldots, 1, 0, \ldots, 0)$ be a vector of $n$ components, with 1's in the
first $r_i$ positions, and 0's elsewhere. A matrix of the form

$\bar{A} = \begin{bmatrix} \delta_1 \\ \vdots \\ \delta_m \end{bmatrix}$

is called maximal, and $\bar{A}$ is called the maximal form of $A$. Note that $\bar{A}$ is formed
from $A$ by a rearrangement of the 1's in the rows of $A$. It is clear that for $A$
maximal, the class $\mathfrak{A}$ contains the single entry $A$.
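The passage from a (0, 1)-matrix to its maximal form is mechanical. A minimal sketch, assuming the matrix is given as a list of rows of 0's and 1's; the function names and the sample matrix are chosen here for illustration:

```python
def maximal_form(a):
    """Move the 1's of every row to the leftmost positions of that row."""
    n = len(a[0])
    return [[1] * sum(row) + [0] * (n - sum(row)) for row in a]

def row_sums(a):
    return [sum(row) for row in a]

def column_sums(a):
    return [sum(col) for col in zip(*a)]

A = [[0, 1, 1],
     [1, 0, 0],
     [0, 1, 0]]
A_bar = maximal_form(A)
# A_bar == [[1, 1, 0], [1, 0, 0], [1, 0, 0]]
```

The row sums are unchanged, while the column sums of the maximal form are non-increasing from left to right.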
Consider the 2 by 2 submatrices of $A$ of the types

$A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$.

An interchange is a transformation of the elements of $A$ that changes a minor
of type $A_1$ into type $A_2$, or vice versa, and leaves all other elements of $A$
unaltered. The interchange theorem (5) asserts that if $A$ and $A^*$ are arbitrary
in $\mathfrak{A}$, then $A$ is transformable into $A^*$ by a finite sequence of interchanges.
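A single interchange is simple to express in code. The following sketch assumes a matrix stored as a list of rows and uses 0-based row and column indices; the function name `interchange` and the sample matrix are hypothetical.

```python
def interchange(a, i, j, k, l):
    """Flip the 2 x 2 minor on rows i, j and columns k, l between the two
    types [[1, 0], [0, 1]] and [[0, 1], [1, 0]]; return a new matrix."""
    minor = (a[i][k], a[i][l], a[j][k], a[j][l])
    if minor not in {(1, 0, 0, 1), (0, 1, 1, 0)}:
        raise ValueError("the chosen minor is not of either type")
    b = [row[:] for row in a]
    for r in (i, j):
        for c in (k, l):
            b[r][c] = 1 - b[r][c]
    return b

A = [[1, 0, 1],
     [0, 1, 0],
     [1, 1, 0]]
B = interchange(A, 0, 1, 0, 1)
# B == [[0, 1, 1], [1, 0, 0], [1, 1, 0]]
```

Each affected row and column loses one 1 and gains one, so the row and column sum vectors are unchanged and `B` lies in the same class as `A`.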

The term rank $\rho$ of $A$ is the order of the greatest minor of $A$ with a non-zero
term in its determinant expansion (4). This integer is also equal to the minimal
number of rows and columns that collectively contain all the non-zero elements
of $A$ (3). Let $\tilde\rho$ be the minimal and $\bar\rho$ the maximal term rank for the matrices
in $\mathfrak{A}$. The interchange theorem (5) implies the existence of an $A$ in $\mathfrak{A}$ of term
rank $\rho$, an arbitrary integer in the interval $\tilde\rho \le \rho \le \bar\rho$. In what follows we
derive a simple formula for $\bar\rho$ and study further combinatorial consequences
of the term rank concept.
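Both descriptions of the term rank can be computed directly for small matrices. The sketch below is an illustration and not Ryser's method: it finds the largest set of 1's with no two in a common row or column by the standard augmenting-path technique for bipartite matching, and compares the result with a brute-force search for the smallest covering set of rows and columns. The function names and the sample matrix are assumptions.

```python
from itertools import combinations

def term_rank(a):
    """Size of a largest set of 1's with no two in the same row or column."""
    m, n = len(a), len(a[0])
    row_of_col = [-1] * n          # row currently matched to each column

    def try_row(i, seen):
        # Try to place a 1 for row i, re-routing earlier rows if necessary.
        for j in range(n):
            if a[i][j] == 1 and not seen[j]:
                seen[j] = True
                if row_of_col[j] == -1 or try_row(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    return sum(1 for i in range(m) if try_row(i, [False] * n))

def min_line_cover(a):
    """Fewest rows and columns that together contain every 1 (brute force)."""
    m, n = len(a), len(a[0])
    ones = [(i, j) for i in range(m) for j in range(n) if a[i][j] == 1]
    for size in range(m + n + 1):
        for k in range(size + 1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), size - k):
                    if all(i in rows or j in cols for i, j in ones):
                        return size

example = [[1, 1, 0],
           [1, 0, 0],
           [1, 0, 0]]
print(term_rank(example), min_line_cover(example))   # prints: 2 2
```

The brute-force cover is exponential and is included only as a cross-check on small inputs.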


2. The maximal term rank. Let $\mathfrak{A}$ be the class of all (0, 1)-matrices $A$
of $m$ rows and $n$ columns, with row sum vector $R$ and column sum vector $S$.
We suppose throughout that the components of the row sum vector $R$ and
column sum vector $S$ of $A$ are positive. This is no genuine restriction on $A$ in
the study of term rank. We proceed to evaluate $\bar\rho$, the maximal term rank for
the matrices in $\mathfrak{A}$.

Received March 8, 1957. This work was sponsored in part by the Office of Ordnance Research.

For this purpose, let $R' = (r_1 - 1, \ldots, r_m - 1)$, where $r_i - 1 \ge 0$. Let
$\bar{A}'$ be the maximal matrix of $m$ rows and $n$ columns having row sum vector
$R'$, and let the column sum vector of $\bar{A}'$ equal

$\bar{S}' = (\bar{s}_1', \ldots, \bar{s}_n')$.

Note that if $\bar{A}$ is the maximal form of $A$ and if the column sum vector of $\bar{A}$
is $(\bar{s}_1, \ldots, \bar{s}_n)$, then $\bar{s}_i' = \bar{s}_{i+1}$ $(i = 1, \ldots, n - 1)$ and $\bar{s}_n' = 0$. Renumber
the subscripts of the column sum vector $S = (s_1, \ldots, s_n)$ of $A$ so that

$s_1 \ge \cdots \ge s_n$,

and define the integers $s_i' \ge 0$ by


Finally, let 


THEOREM 2.1. Let $\bar\rho$ equal the maximal term rank for the matrices in $\mathfrak{A}$. Let
$M$ equal the largest integer in the set

$\sum_{i=0}^{k} (s_i' - \bar{s}_i') \quad (k = 0, 1, \ldots, n)$.

Then

$\bar\rho = m - M$.

Let $A_{\bar\rho}$ be an $m$ by $n$ matrix with maximal term rank $\bar\rho$. Without loss of
generality, we may assume that the row sum vector $R = (r_1, r_2, \ldots, r_m)$ and
column sum vector $S = (s_1, s_2, \ldots, s_n)$ of $A_{\bar\rho}$ satisfy $r_1 \ge \cdots \ge r_m$ and
$s_1 \ge \cdots \ge s_n$. We select a specified set of $\bar\rho$ 1's of $A_{\bar\rho}$ accounting for the
maximal term rank and call them the essential 1's of $A_{\bar\rho}$. All other 1's of $A_{\bar\rho}$ are
then referred to as unessential.

We derive two lemmas.

LEMMA 1. For $0 \le k \le n$,

$\sum_{i=0}^{k} (s_i' - \bar{s}_i') \le m - \bar\rho$.

Let $B$ be formed from $A_{\bar\rho}$ by replacing the $\bar\rho$ essential 1's of $A_{\bar\rho}$ by 0's. We
agree to write $A_{\bar\rho}$ so that
S$ >...->5n; OD... pons bg = Sete: —1. 


Here $s_i$ and $b_i$ denote the sums of column $i$ of $A_{\bar\rho}$ and $B$, respectively, and
column $i$ of $A_{\bar\rho}$ contains an essential 1 if and only if $e_i = 0$. Note $e_i = +1$ for
exactly $n - \bar\rho$ values of $i$.
Let $\bar{B}$ be the maximal form of $B$, with column sums $\bar{b}_1 \ge \cdots \ge \bar{b}_n$. Then
for each $k$, $0 \le k \le n$,
k 


k k 
Yo si < > b< do by. 


From the definitions of the 3, and the 6,, 

k k 

u 5,’ + (m— 5) > a b,, 
whence 


k k 
 » <b 5,’ + (m — B). 


t=O t=O 


LemMMA 2. Let f be such that 0 < f < n and 
i 
Da (si — 5) = m —B. 


Then the matrix $A_{\bar\rho}$ of maximal term rank $\bar\rho$ may upon permutations of rows and
columns be written in the form


, i & -@ 
Pre E: 0 0 O 
7a * se 
* 0 0 0 


Here $S$ is a matrix entirely of 1's of size $e$ by $f$. The matrices $E_1$ and $E_2$ are
square of orders $e$ and $f$, respectively, $I$ is an identity matrix of order $g$, with
$\bar\rho = e + f + g$, and the 0's denote zero blocks. The $\bar\rho$ essential 1's of $A_{\bar\rho}$ appear
on the main diagonals of $E_1$, $E_2$, and $I$. The degenerate cases $e = 0$ and $g = 0$
are not excluded.


Reading the inequalities of Lemma 1 as equalities, we obtain 


f f f f 
» s; = DL b= dL b= > 3 + (m — 8B). 


i=0 t_ 


This tells us that the matrix B may be written in the form 


»-[3 3] 


where S is the e by f matrix of 1's, and where the matrix X has at least one 
1 in each row. Now 


I f f 
> s/ = De si f= Le be 


i=0 
















implies that essential 1’s occur in the first f columns of A;, and they may be 
placed on the main diagonal of Ep». 

The equation 

f f 

Ls = >> 5) +m—p+f 

on = 
implies that there are m — p + f rows of A; in which 0’s occur in each of the 
columns f + 1,...,m. Let e’ < e essential 1’s of A; occur in rows l,...,¢ 
of A;, and let g essential 1’s occur in rows e +f +1,...,m of A;. Then 
e+f+g=pand m—j+/f+g = m — e, whence e’ = e. Hence essential 
l’s occur in the first e rows of A;, and these may be placed on the main 
diagonal of EF. 

To prove Theorem 2.1 it suffices to establish the existence of a $k = f$ for
which equality holds in Lemma 1. The theorem is valid for $m$ by 1 and 1 by
$n$ matrices. The induction hypothesis asserts the statement of the theorem for
all matrices of size $m - 1$ by $n'$, with $1 \le n' \le n$, and we shall prove the
theorem for matrices of size $m$ by $n$. Moreover, if $\bar\rho = m$, then


So — &, = m — p = 0. 


Also, if 5 = n, then 


8 ” hn 

ps (si —8/) = yo ae > 1-m) = m3 
i=0 i=0 i=0 

Since the theorem is valid in each of these cases, we may assume that p < m 

and p < . 

In A; suppose that s; > s,. Then we may normalize the first row of A; in 
one of two ways. Either a; = 1 or, in the other case, a;, = 0 and a,, = 0 or 
1, with a,, = 1 an essential 1 of A;. For otherwise we must have a;; = 0 and 
ai; = 1, an unessential 1 of A;. But then there exists an unessential 1 of A; 
such that a,; = 1 and a,; = 0. We may then perform an interchange that 
does not affect the term rank and obtain a,; = 1 and a;, = 0. We agree to 
normalize the first row of A; to fulfill this requirement. 

Now delete row 1 from the normalized $A_{\bar\rho}$ of maximal term rank $\bar\rho$. Also
delete any zero columns from the resulting $(m - 1)$-rowed matrix. We then
obtain a matrix $C$ of $m - 1$ rows and $n'$ columns, $1 \le n' \le n$. Let $C$ belong to
the class $\mathfrak{C}$. The maximal term rank for the matrices in $\mathfrak{C}$ equals $\bar\rho$ or $\bar\rho - 1$.

Suppose there exists a $C'$ of term rank $\bar\rho$ in $\mathfrak{C}$. To $C'$ we may adjoin $n - n'$
columns of 0's and the first row of $A_{\bar\rho}$, and thereby obtain a matrix $A' = [a_{rs}']$
in the class $\mathfrak{A}$. Now if $a_{1i}' = 1$, where column $i$ does not contain an essential 1
of $C'$, then this contradicts the maximality of $\bar\rho$ in $\mathfrak{A}$. Suppose then that
$a_{1i}' = 0$ for each column $i$ that does not contain an essential 1 of $C'$. Since
$r_1 \ge r_i$, we may perform an interchange involving row 1 and some other row
of $A'$ to obtain $a_{1i}' = 1$ for some column $i$ not containing an essential 1 of $C'$.
This again contradicts the maximality of $\bar\rho$ in $\mathfrak{A}$. Hence we conclude that $\bar\rho - 1$
is the maximal term rank for the matrices in $\mathfrak{C}$. This term rank is attained by
$C$. The $\bar\rho - 1$ essential 1's of $C$ plus one essential 1 from the first row of $A_{\bar\rho}$
comprise the $\bar\rho$ essential 1's of $A_{\bar\rho}$.
We permute the columns of C so that ¢ > ¢: >... > cy and apply the 
induction hypothesis to C. Then there exists an f,0 < f < mn’, such that 
f f 
p 3 cc; = > é,’ + (m — 8). 
t=O i=0 
We may suppose that 0 < f <n’. For if f = 0, then 6 = m and the theorem 
is valid. Also if f = m’, then p = n’ + 1. This implies that n’ <n. If n’=n—1, 
then 6 = m and the theorem is valid. Thus if f = m’, we may suppose that 
n’' <n — 2. But in this case the last n — n’ > 2 columns of A; must have 
1's in the first row, and only one of them can be essential. By the normalization 
process applied to A;, every column of A; headed by 0’s must have column 
sum equal to 1 and these columns occupy the last of the first »’ positions in 
A;. If such columns exist we may take a smaller value of f in C. If all of the 
columns of A; are headed by 1's, the theorem is valid for A; with f = n’. 
Thus we may suppose that 0 < f < nm’, and upon permutations of rows 
and columns, we may write the matrix C in the form given by Lemma 2: 


- a 2 © 
D. 0 O O 
i... *. te 
* S-@ @ 


Here S is the matrix of 1's of size e by f, and the orders of D,, D2, and J 
total ps — 1. The j — 1 essential 1’s of C appear on the main diagonals of 
D,, Dz, and J. The matrix J need not appear, but we may assume that e # 0. 
For if e = 0, we again obtain — 1=n’. 

We restore now to C the m — n’ zero columns, and finally a row of r; I's 
and m — r; 0's. We thereby obtain A, where A = [@,,] is the same as A; 
apart from possible row and column permutations. Suppose that @,, = 1 
(¢=1,...,f). Then 

f 
D (si — 3) = m — 8, 
t=O 
and the theorem follows. 
Suppose that on the other hand some @,, = 0, where 1 < j < f. If we per- 


mute the first f columns of A, then we may assume that 4, = 1 (i = 1,..., A) 
and that @,, = 0 (j = 4+1,...,f). The case A = 0 is not to be excluded. 
If h = 0, then 4, = 0 (j = 1,...,f). Now there must exist an essential 1 of 


the form 4, = 1 for some u, where u satisfies e+ f+1<u<n. If there 
does not exist an unessential 1 of the form 4, = 1, where v satisfies 
f+1<v<n, then 

f 


be (s;’ nite 5,’) =m — 6, 


i=0 
















and the theorem is valid. Suppose then that one vr more unessential 1's exist 
of the form @;, = 1, where v satisfies f + 1 < v < m. We assert that then an 
unessential 1 cannot occur in the intersection of rows e + 2,..., mand columns 
h+1,...,f of A. For suppose that an unessential 1 appears in this position. 
Then by our normalization process, for each v associated with the unessential 
l’s of the form 4, = 1, f + 1 < v < n, we must have @,, = 1 (j= 1,..., 
e + 1). Furthermore, there must exist in each of these columns an essential 
1 of the form @,, = 1, for some ¢ satisfying e + f + 2 < t < m. All of the 
remaining entries of these columns must be 0. But consider now row 1 and 
row 2 of A. A 1 in row 1 may appear directly above a 0 in row 2 only in the
column of the essential 1 of the form @,, = 1. However, a 0 in row 1 must 
appear directly above a 1 in row 2 in at least two columns. But this contra- 
dicts the fact that the number of 1's in row 1 of A is greater than or equal to 
the number of 1's in row 2 of A. Thus an unessential 1 cannot occur in the 
intersection of rows e + 2,...,m and columns h + 1,...,f of A. Hence 
it follows that 
$$\sum_{i=0}^{h} s_i' - \sum_{i=0}^{h} \bar{s}_i' = m - \bar\rho.$$

Note that the degenerate case h = 0 gives $\bar\rho = m$. This completes the proof.


3. Applications. In the following applications we continue to require
positive components for the vectors R and S that determine the class $\mathfrak{A}$.
A (0, 1)-matrix $A = [a_{ij}]$ may be regarded as an incidence matrix distributing
n elements $x_1, \ldots, x_n$ into m sets $S_1, \ldots, S_m$. Here $a_{ij} = 1$ or 0 according as
$x_j$ is or is not in $S_i$. From this approach the term rank of a matrix generalizes
the concept of a system of distinct representatives for subsets $S_1, \ldots, S_m$ of a
finite set (2). The subsets $S_1, \ldots, S_m$ possess a system of distinct representatives
if and only if the term rank of the associated incidence matrix satisfies
$\rho = m$. In this case we say A possesses a system of distinct representatives.


THEOREM 3.1. There exists an A in $\mathfrak{A}$ possessing a system of distinct representatives
if and only if

$$\sum_{i=0}^{k} (s_i' - \bar{s}_i') \le 0 \qquad (k = 0, 1, \ldots, n).$$

This is the special case of Theorem 2.1 with $\bar\rho = m$.
For a (0, 1)-matrix A, let $N_0(A)$ denote the number of 0's in A and let
$N_1(A)$ denote the number of 1's in A.


THEOREM 3.2. Let A be in $\mathfrak{A}$ and let $\bar\rho < m, n$. Then upon permutations of
rows and columns, A may be reduced to the form

$$A = \begin{bmatrix} W & X \\ Y & Z \end{bmatrix}.$$

Here W is of size e by f $(0 < e < m,\ 0 < f < n)$ and $N_0(W) + N_1(Z) = \bar\rho - (e + f)$.
For $A_{\bar\rho}$, we have $N_0(W) = 0$ and $N_1(Z) = \bar\rho - (e + f)$.


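In the equation

$$\sum_{i=0}^{f} (s_i' - \bar{s}_i') = m - \bar\rho,$$

we have $0 < f < n$, for otherwise $\bar\rho = m$ or $\bar\rho = n$. Also for the matrix $A_{\bar\rho}$ of
Lemma 2, $0 < e < m$ and

$$\sum_{i=1}^{e} r_i + \sum_{i=1}^{f} s_i + \bar\rho - (e + f) - ef = N_1(A_{\bar\rho}).$$

But

$$\sum_{i=1}^{e} r_i + \sum_{i=1}^{f} s_i = N_1(X) + N_1(Y) + 2N_1(W)$$

and

$$N_1(W) + N_1(X) + N_1(Y) + N_1(Z) = N_1(A_{\bar\rho}).$$

Hence

$$ef - N_1(W) + N_1(Z) = \bar\rho - (e + f)$$

and

$$N_0(W) + N_1(Z) = \bar\rho - (e + f).$$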


Let $A = [a_{ij}]$ be in $\mathfrak{A}$. Suppose an element $a_{ef} = 1$ of A is such that no
sequence of interchanges applied to A replaces $a_{ef} = 1$ by 0. Then $a_{ef} = 1$ is
called an invariant 1 of A. An analogous definition holds for an invariant 0.


THEOREM 3.3. Let $a_{ef}$ be an invariant 1 of A. If $A' = [a_{ij}']$ is in $\mathfrak{A}$, then $a_{ef}'$
is an invariant 1 of $A'$.

For if for some $A^* = [a_{ij}^*]$ in $\mathfrak{A}$, $a_{ef}^* = 0$, then transforming A into $A^*$
by interchanges contradicts the hypothesis that $a_{ef} = 1$ is an invariant 1 of A.
Thus all or none of the matrices in $\mathfrak{A}$ contains an invariant 1, and we refer
to $\mathfrak{A}$ as being with or without an invariant 1.


THEOREM 3.4. Let A contain an invariant 1. Then by permutations of rows
and columns, A may be reduced to the form

$$\begin{bmatrix} S & X \\ Y & 0 \end{bmatrix}.$$

Here S is the matrix of 1's and contains the invariant 1 of A.


For by permutations of rows and columns we may reduce A to the following 
form: 
















[Matrix display: the block form of $A^*$; its blocks are described in the next paragraph.]





Here the 1 in the (1, 1) position of A* is the invariant 1. The block in the lower 
right hand corner is then composed entirely of 0’s. We permute rows so that 
R, contains at least one 1 in each row, and then permute columns so that C; 
contains at least one 1 in each column. The intersection of the rows of A* 
containing R,; and the columns of A* containing C, is S, a matrix of 1’s. We 
now permute columns so that S* is a matrix of 1’s and C, contains at least 
one 0 in each column. Next we permute rows so that § is a matrix of 1’s and 
R, contains at least one 0 in each row. The intersection of the columns of A* 
containing Cy and the rows of A* containing R, is a zero matrix. If one or more 
of S*, Cy, 8, Ro do not appear, the theorem follows. If all appear, we replace 
M by a matrix of the form

$$\begin{bmatrix} R_1^* \\ 0 \end{bmatrix}$$

and N by a matrix of the form $[C_1^* \;\; 0]$, where $R_1^*$ has at least one 1 in each row
and $C_1^*$ has at least one 1 in each column, and then continue as before. This
procedure must terminate, and upon termination we obtain the matrix of the 
theorem. 

Note that X and Y may contain further invariant 1's and the normalizing
procedure may be applied to each of these blocks separately. Also, if A, X,
and Y are of term ranks $\rho$, $\rho_x$, and $\rho_y$, respectively, and if S has size $e'$ by $f'$, then

$$\rho = \rho_x + \rho_y + \min(e' - \rho_x,\ f' - \rho_y),$$

whence

$$\rho = \min(e' + \rho_y,\ f' + \rho_x).$$
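
The term-rank identity for the block form of Theorem 3.4 can be checked on a small case. The following is only an illustrative sketch: the helper name `term_rank` and the sample blocks S, X, Y are assumptions made for the example, and the term rank is computed directly from its definition (the largest set of 1's with no two on a line) by an augmenting-path matching between rows and columns, not by the normalization procedure of the text.

```python
def term_rank(M):
    """Largest number of 1's of M with no two in the same row or column."""
    match_col = {}                      # column index -> row index using it

    def augment(i, seen):
        # Try to give row i a column, re-routing earlier rows if needed.
        for j, entry in enumerate(M[i]):
            if entry == 1 and j not in seen:
                seen.add(j)
                if j not in match_col or augment(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    return sum(1 for i in range(len(M)) if augment(i, set()))


# Hypothetical blocks: S is the 2 by 2 matrix of 1's, so e' = f' = 2.
S = [[1, 1], [1, 1]]
X = [[1, 0], [0, 0]]
Y = [[1, 1], [0, 0]]
zero = [[0, 0], [0, 0]]

# Assemble A = [[S, X], [Y, 0]] row by row.
A = [s + x for s, x in zip(S, X)] + [y + z for y, z in zip(Y, zero)]

e1, f1 = len(S), len(S[0])
rho, rho_x, rho_y = term_rank(A), term_rank(X), term_rank(Y)
first_form = rho_x + rho_y + min(e1 - rho_x, f1 - rho_y)
second_form = min(e1 + rho_y, f1 + rho_x)
print(rho, first_form, second_form)
```

For these sample blocks the three printed numbers agree, as the two displayed formulas require.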


THEOREM 3.5. If $\mathfrak{A}$ is without an invariant 1 and if $\bar\rho < m, n$, then the minimal
term rank $\tilde\rho$ for the matrices in $\mathfrak{A}$ must satisfy $\tilde\rho < \bar\rho$.










In the matrix $A_{\bar\rho}$ of Theorem 3.2, the 1 in the (1, 1) position is not invariant.
But by Theorem 3.2, $N_0(W) + N_1(Z) = \bar\rho - (e + f)$. This means that there
are matrices in $\mathfrak{A}$ with fewer than $\bar\rho - (e + f)$ 1's in Z. Hence $\tilde\rho < \bar\rho$.

Note that Theorem 3.5 is not necessarily valid for $\bar\rho = m$. For we may let
m = n, and let $\mathfrak{A}$ be the class of all (0, 1)-matrices with exactly k 1's in each row
and column, 1 < k < m. Then $\mathfrak{A}$ is without an invariant 1, but $\tilde\rho = \bar\rho = m$ (3).
Also Theorem 3.5 need not hold for a class $\mathfrak{A}$ with an invariant 1. For example,
let A be maximal. Then A is the only matrix in $\mathfrak{A}$, and we must have $\tilde\rho = \bar\rho$.

In conclusion, a deeper insight into the structure of $\mathfrak{A}$ would be of considerable
interest. An arithmetic formula for $\tilde\rho$ analogous to the formula for $\bar\rho$ given
in §2 would be especially desirable.


REFERENCES 


1. David Gale, A theorem on flows in networks, Pacific J. Math., 7 (1957), 1073-1082.

2. P. Hall, On representatives of subsets, J. Lond. Math. Soc., 10 (1935), 26-30.

3. Dénes König, Theorie der endlichen und unendlichen Graphen (New York, 1950).

4. Oystein Ore, Graphs and matching theorems, Duke Math. J., 22 (1955), 625-639.

5. H. J. Ryser, Combinatorial properties of matrices of zeros and ones, Can. J. Math., 9 (1957),
371-377.


Ohio State University 











ON THE HASSE-MINKOWSKI INVARIANT OF 
THE KRONECKER PRODUCT OF MATRICES 


MANOHAR N. VARTAK 


1. Introduction. Let $R = (r_{ij})$ be an m × n matrix and let $S = (s_{kl})$ be
a p × q matrix defined over a field F. The Kronecker product R × S of R and
S is defined as follows:


Definition 1.1. The Kronecker product R × S of the matrices R and S is
given by

$$R \times S = \begin{bmatrix} r_{11}S & \cdots & r_{1n}S \\ \vdots & & \vdots \\ r_{m1}S & \cdots & r_{mn}S \end{bmatrix} \tag{1.1}$$

where $r_{ij}S$; i = 1, 2, ..., m; j = 1, 2, ..., n, is itself a p × q matrix (1,
69-70).


We shall always use the symbol “‘ X”’ in a product of matrices to denote the 
Kronecker product. The ordinary product of R and S (whenever it exists) will 
be denoted by R-S or RS. 
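
As a concrete reading of Definition 1.1, the block layout can be sketched in a few lines. The function name `kron` and the sample matrices are assumptions made for this illustration; matrices are plain lists of rows.

```python
def kron(R, S):
    """Kronecker product R x S: block (i, j) of the result is r_ij * S."""
    return [[r * s for r in R_row for s in S_row]
            for R_row in R for S_row in S]


R = [[1, 2],
     [3, 4]]
S = [[0, 1],
     [1, 0]]
P = kron(R, S)
for row in P:
    print(row)
```

Each 2 by 2 block of `P` is the corresponding entry of R times S, so an m × n matrix and a p × q matrix give an mp × nq product.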

The Hasse-Minkowski invariant is a number-theoretic function occurring
in the arithmetical theory of quadratic forms. With respect to the matrix
$A = (a_{ij})$ of a quadratic form

$$Q = \sum_{i,j=1}^{n} a_{ij} x_i x_j,$$

it is defined as follows:


Definition 1.2. Let A be an n × n non-singular symmetric matrix with
rational elements and let $D_i$ (i = 1, 2, ..., n) denote the leading principal
minor determinant of order i in the matrix A. Suppose further that none of the
$D_i$ is zero. Then the integer

$$c_p = c_p(A) = (-1, -D_n)_p \prod_{i=1}^{n-1} (D_i, -D_{i+1})_p \tag{1.2}$$

is called the Hasse-Minkowski invariant of A, where p is a prime and $(a, b)_p$ is
the Hilbert norm residue symbol (2).

From the properties of the Hilbert norm residue symbol we get the following
expressions for $c_p(A)$ equivalent to 1.2:

$$c_p(A) = (-1, -1)_p \prod_{i=0}^{n-1} (D_{i+1}, -D_i)_p \tag{1.3}$$


Received June 27, 1957. 


where $D_0 = 1$, and

$$c_p(A) = (-1, -D_1)_p \prod_{i=1}^{n-1} (D_{i+1}, -D_i)_p. \tag{1.4}$$


The problem considered in this paper is that of obtaining the value of
$c_p(A \times B)$ in terms of $c_p(A)$, $c_p(B)$ and the determinants |A| and |B| of A
and B.

In the next section we prove a theorem giving the exact relation between
$c_p(A \times B)$ on one side and $c_p(A)$, $c_p(B)$, |A| and |B| on the other. In §3 are
considered some particular cases of this result.


2. The Hasse-Minkowski invariant $c_p(A \times B)$. We shall first prove the
following Lemma:


LEMMA 2.1. If A is a square matrix of order m and B is a square matrix of
order n, then the leading principal minor determinant $|D_u|$ of order u in the
Kronecker product A × B is given by

$$|D_u| = |A_r|^{n-s}\,|A_{r+1}|^{s}\,|B|^{r}\,|B_s| \tag{2.1}$$

where rn + s = u; 0 ≤ r < m; 0 < s ≤ n; $|A_r|$ denotes the leading principal
minor determinant of order r in A, $|B_s|$ denotes the leading principal minor
determinant of order s in B, and none of the determinants $|A_r|$ is zero.





Proof. In the first place observe that for any given u, 1 ≤ u ≤ mn, there is
one and only one pair of integers r and s such that rn + s = u with 0 ≤ r < m
and 0 < s ≤ n. We therefore have:

$$|D_u| = \begin{vmatrix}
a_{11}B & a_{12}B & \cdots & a_{1r}B & a_{1,r+1}B^{(s)} \\
a_{21}B & a_{22}B & \cdots & a_{2r}B & a_{2,r+1}B^{(s)} \\
\vdots & \vdots & & \vdots & \vdots \\
a_{r1}B & a_{r2}B & \cdots & a_{rr}B & a_{r,r+1}B^{(s)} \\
a_{r+1,1}B_{(s)} & a_{r+1,2}B_{(s)} & \cdots & a_{r+1,r}B_{(s)} & a_{r+1,r+1}B_{(s)}^{(s)}
\end{vmatrix} \tag{2.2}$$

where $B^{(s)}$ is the n × s matrix obtained from B by deleting the last (n - s)
columns, $B_{(s)}$ is the s × n matrix obtained from B by deleting the last (n - s)
rows, and $B_{(s)}^{(s)}$ is the s × s matrix obtained from B by deleting the last (n - s)








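columns and (n - s) rows. From (2.2) we have, since $|A_1| = a_{11} \neq 0$,

$$|D_u| = \begin{vmatrix}
a_{11}B & a_{12}B & a_{13}B & \cdots & a_{1r}B & a_{1,r+1}B^{(s)} \\
0_{n \times n} & a_{22}^{(1)}B & a_{23}^{(1)}B & \cdots & a_{2r}^{(1)}B & a_{2,r+1}^{(1)}B^{(s)} \\
0_{n \times n} & a_{32}^{(1)}B & a_{33}^{(1)}B & \cdots & a_{3r}^{(1)}B & a_{3,r+1}^{(1)}B^{(s)} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0_{n \times n} & a_{r2}^{(1)}B & a_{r3}^{(1)}B & \cdots & a_{rr}^{(1)}B & a_{r,r+1}^{(1)}B^{(s)} \\
0_{s \times n} & a_{r+1,2}^{(1)}B_{(s)} & a_{r+1,3}^{(1)}B_{(s)} & \cdots & a_{r+1,r}^{(1)}B_{(s)} & a_{r+1,r+1}^{(1)}B_{(s)}^{(s)}
\end{vmatrix}$$

so that

$$|D_u| = a_{11}^{n}\,|B| \begin{vmatrix}
a_{22}^{(1)}B & a_{23}^{(1)}B & \cdots & a_{2r}^{(1)}B & a_{2,r+1}^{(1)}B^{(s)} \\
a_{32}^{(1)}B & a_{33}^{(1)}B & \cdots & a_{3r}^{(1)}B & a_{3,r+1}^{(1)}B^{(s)} \\
\vdots & \vdots & & \vdots & \vdots \\
a_{r2}^{(1)}B & a_{r3}^{(1)}B & \cdots & a_{rr}^{(1)}B & a_{r,r+1}^{(1)}B^{(s)} \\
a_{r+1,2}^{(1)}B_{(s)} & a_{r+1,3}^{(1)}B_{(s)} & \cdots & a_{r+1,r}^{(1)}B_{(s)} & a_{r+1,r+1}^{(1)}B_{(s)}^{(s)}
\end{vmatrix}$$

where $0_{n \times n}$ is the zero matrix of order n × n,

$$a_{ij}^{(1)} = \frac{a_{ij}a_{11} - a_{i1}a_{1j}}{a_{11}}, \qquad i, j = 2, \ldots, r + 1,$$

and in particular

$$a_{22}^{(1)} = \frac{|A_2|}{|A_1|} \neq 0.$$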


Proceeding in this way we finally get, since none of $|A_r|$ is zero,

$$|D_u| = |A_r|^{n}\,|B|^{r-1} \begin{vmatrix}
B & 0_{n \times s} \\
\dfrac{a_{r+1,r}^{(r-1)}}{a_{rr}^{(r-1)}}\,B_{(s)} & a_{r+1,r+1}^{(r)}\,B_{(s)}^{(s)}
\end{vmatrix}$$

where

$$a_{r+1,r+1}^{(r)} = \frac{a_{r+1,r+1}^{(r-1)}\,a_{rr}^{(r-1)} - a_{r+1,r}^{(r-1)}\,a_{r,r+1}^{(r-1)}}{a_{rr}^{(r-1)}} = \frac{|A_{r+1}|}{|A_r|}.$$

Hence

$$|D_u| = |A_r|^{n-s}\,|A_{r+1}|^{s}\,|B|^{r}\,|B_s|,$$

which proves the Lemma.


It is interesting to note that if we let r = m - 1 and s = n, then 2.1
reduces to a well-known result (3)

$$|A \times B| = |A|^{n}\,|B|^{m}. \tag{2.3}$$
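
Lemma 2.1 and the special case 2.3 lend themselves to a quick numerical check. The sketch below is illustrative only: the helper names (`det`, `leading_minor`, `kron`, `lemma_rhs`) and the two sample matrices are assumptions, and exact rational arithmetic is used so that the comparison is not affected by rounding. The determinant of the empty matrix is taken as 1, matching the convention $D_0 = 1$.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination over the rationals (empty matrix -> 1)."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:                       # a row swap changes the sign
            M[c], M[pivot] = M[pivot], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return d


def leading_minor(M, k):
    """Leading principal minor determinant of order k."""
    return det([row[:k] for row in M[:k]])


def kron(R, S):
    return [[r * s for r in R_row for s in S_row] for R_row in R for S_row in S]


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # order m = 3, leading minors 2, 5, 18
B = [[1, 2], [2, 5]]                     # order n = 2, leading minors 1, 1
m, n = len(A), len(B)
K = kron(A, B)


def lemma_rhs(r, s):
    """Right-hand side of 2.1 for u = r*n + s."""
    return (leading_minor(A, r) ** (n - s) * leading_minor(A, r + 1) ** s
            * det(B) ** r * leading_minor(B, s))


checks = [(leading_minor(K, r * n + s), lemma_rhs(r, s))
          for r in range(m) for s in range(1, n + 1)]
print(all(lhs == rhs for lhs, rhs in checks))
```

Taking r = m - 1 and s = n in the same loop reproduces 2.3 for these matrices.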
We are now in a position to prove the following: 


THEOREM 2.1. Let A be a symmetric matrix of order m and B be a symmetric
matrix of order n. Let the elements of A and B be rational numbers and assume
that all the leading principal minor determinants of A and B are different from
zero. Then

$$c_p(A \times B) = (-1, -1)_p^{m+n-1}\,\{c_p(A)\}^{n}\,\{c_p(B)\}^{m}\,(|A|, -1)_p^{\frac12 n(n-1)} \times (|B|, -1)_p^{\frac12 m(m-1)}\,(|A|, |B|)_p^{mn-1}. \tag{2.4}$$

Proof. In the first place observe that A × B is symmetric and none of the
leading principal minor determinants of A × B is zero since the same properties
hold for A and B. Thus $c_p(A \times B)$ has a meaning. Further from 1.3
and 2.1 we have

$$c_p(A \times B) = (-1, -1)_p \prod_{r=0}^{m-1} \prod_{s=1}^{n} \bigl(|A_r|^{n-s}|A_{r+1}|^{s}|B|^{r}|B_s|,\ -|A_r|^{n-s+1}|A_{r+1}|^{s-1}|B|^{r}|B_{s-1}|\bigr)_p. \tag{2.5}$$










Since the Hilbert norm residue symbol $(a, b)_p$ satisfies the property

$$(a_1 a_2, b)_p = (a_1, b)_p\,(a_2, b)_p, \tag{2.6}$$

it follows that the factor written after the product signs on the right hand
side of 2.5 breaks into twenty factors which are simplified as follows, where
we have dropped, for convenience, the subscript p:
(i) $(|A_r|^{n-s}, -1) = (|A_r|, -1)^{n-s}$,
(ii) $(|A_r|^{n-s}, |A_r|^{n-s+1}) = (|A_r|, |A_r|)^{(n-s)(n-s+1)} = 1$,
(iii) $(|A_r|^{n-s}, |A_{r+1}|^{s-1}) = (|A_r|, |A_{r+1}|)^{(n-s)(s-1)} = (|A_r|, |A_{r+1}|)^{n(s-1)}$,
(iv) $(|A_r|^{n-s}, |B|^{r}) = (|A_r|, |B|)^{r(n-s)}$,
(v) $(|A_r|^{n-s}, |B_{s-1}|) = (|A_r|, |B_{s-1}|)^{n-s}$,
(vi) $(|A_{r+1}|^{s}, -1) = (|A_{r+1}|, -1)^{s}$,
(vii) $(|A_{r+1}|^{s}, |A_r|^{n-s+1}) = (|A_r|, |A_{r+1}|)^{s(n-s+1)} = (|A_r|, |A_{r+1}|)^{ns}$,
(viii) $(|A_{r+1}|^{s}, |A_{r+1}|^{s-1}) = (|A_{r+1}|, -1)^{s(s-1)} = 1$,
(ix) $(|A_{r+1}|^{s}, |B|^{r}) = (|A_{r+1}|, |B|)^{rs}$,
(x) $(|A_{r+1}|^{s}, |B_{s-1}|) = (|A_{r+1}|, |B_{s-1}|)^{s}$,
(xi) $(|B|^{r}, -1) = (|B|, -1)^{r}$,
(xii) $(|B|^{r}, |A_r|^{n-s+1}) = (|A_r|, |B|)^{r(n-s+1)}$,
(xiii) $(|B|^{r}, |A_{r+1}|^{s-1}) = (|A_{r+1}|, |B|)^{r(s-1)}$,
(xiv) $(|B|^{r}, |B|^{r}) = (|B|, -1)^{r}$,
(xv) $(|B|^{r}, |B_{s-1}|) = (|B|, |B_{s-1}|)^{r}$,
(xvi) $(|B_s|, -1)$,
(xvii) $(|B_s|, |A_r|^{n-s+1}) = (|A_r|, |B_s|)^{n-s+1}$,
(xviii) $(|B_s|, |A_{r+1}|^{s-1}) = (|A_{r+1}|, |B_s|)^{s-1}$,
(xix) $(|B_s|, |B|^{r}) = (|B|, |B_s|)^{r}$,
(xx) $(|B_s|, |B_{s-1}|) = (|B_{s-1}|, |B_s|)$.

All the well-known properties of the Hilbert norm residue symbol, in addition
to 2.6, have been made use of in the above simplifications.
The factors (ii) and (viii) drop out automatically. The factors (iii) and
(vii) give

$$(|A_r|, |A_{r+1}|)^{n(s-1)}\,(|A_r|, |A_{r+1}|)^{ns} = (|A_r|, |A_{r+1}|)^{n}.$$

The factors (iv) and (xii) give

$$(|A_r|, |B|)^{r(n-s)}\,(|A_r|, |B|)^{r(n-s+1)} = (|A_r|, |B|)^{r}.$$

The factors (ix) and (xiii) give

$$(|A_{r+1}|, |B|)^{rs}\,(|A_{r+1}|, |B|)^{r(s-1)} = (|A_{r+1}|, |B|)^{r}.$$

The factors (xi) and (xiv) give

$$(|B|, -1)^{r}\,(|B|, -1)^{r} = 1.$$













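Hence the factor on the right hand side of 2.5 reduces to

$$\begin{aligned}
f = {} & (|A_r|, |A_{r+1}|)^{n}\,(|A_r|, |B|)^{r}\,(|A_{r+1}|, |B|)^{r}\,(|A_r|, -1)^{n-s}\,(|A_{r+1}|, -1)^{s} \\
& \times \{(|A_r|, |B_s|)^{n-s+1}\,(|A_r|, |B_{s-1}|)^{n-s}\}\,\{(|A_{r+1}|, |B_s|)^{s-1}\,(|A_{r+1}|, |B_{s-1}|)^{s}\} \\
& \times \{(|B|, |B_s|)^{r}\,(|B|, |B_{s-1}|)^{r}\}\,\{(|B_s|, -1)\,(|B_s|, |B_{s-1}|)\}.
\end{aligned} \tag{2.7}$$

Now

$$\prod_{s=1}^{n} \{(|A_r|, |B_s|)^{n-s+1}\,(|A_r|, |B_{s-1}|)^{n-s}\}
= (|A_r|, 1)^{n-1} \prod_{s=1}^{n-1} (|A_r|, |B_s|)^{2(n-s)}\,(|A_r|, |B|)
= (|A_r|, |B|)$$

since $(|A_r|, 1) = (|A_r|, 1)_p = 1$ for any prime p.
Similarly

$$\prod_{s=1}^{n} \{(|A_{r+1}|, |B_s|)^{s-1}\,(|A_{r+1}|, |B_{s-1}|)^{s}\} = (|A_{r+1}|, |B|)^{n-1},$$

$$\prod_{s=1}^{n} \{(|B|, |B_s|)^{r}\,(|B|, |B_{s-1}|)^{r}\} = (|B|, -1)^{r}$$

and

$$\prod_{s=1}^{n} \{(|B_s|, -1)\,(|B_s|, |B_{s-1}|)\} = (-1, -1)\,\{c(B)\}$$

from 1.3 and 2.6.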
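Hence we get

$$\begin{aligned}
F = \prod_{s=1}^{n} \{f\} = {} & (|A_r|, |A_{r+1}|)^{n^2}\,(|A_r|, |B|)^{rn}\,(|A_{r+1}|, |B|)^{rn} \\
& \times (|A_r|, -1)^{\frac12 n(n-1)}\,(|A_{r+1}|, -1)^{\frac12 n(n+1)}\,(|A_r|, |B|) \\
& \times (|A_{r+1}|, |B|)^{n-1}\,(|B|, -1)^{r}\,(-1, -1)\,\{c(B)\}.
\end{aligned}$$

But

$$(|A_r|, |A_{r+1}|)^{n^2} = (|A_r|, |A_{r+1}|)^{n(n-1)+n} = (|A_{r+1}|, |A_r|)^{n}.$$

Hence

$$\begin{aligned}
F = {} & (-1, -1)\,\{c(B)\}\,(|B|, -1)^{r}\,(|A_r|, -1)^{\frac12 n(n-1)} \\
& \times (|A_{r+1}|, -1)^{\frac12 n(n-1)}\,(|A_{r+1}|, -|A_r|)^{n} \\
& \times \{(|A_r|, |B|)^{rn+1}\,(|A_{r+1}|, |B|)^{rn+n-1}\}.
\end{aligned} \tag{2.8}$$

Now

$$\prod_{r=0}^{m-1} (|B|, -1)^{r} = (|B|, -1)^{\frac12 m(m-1)},$$

$$\prod_{r=0}^{m-1} \{(|A_r|, -1)^{\frac12 n(n-1)}\,(|A_{r+1}|, -1)^{\frac12 n(n-1)}\} = (|A|, -1)^{\frac12 n(n-1)}$$

and

$$\prod_{r=0}^{m-1} \{(|A_{r+1}|, -|A_r|)^{n}\} = \{(-1, -1)\,c(A)\}^{n}$$

from 1.3 and 2.6.
Finally

$$\prod_{r=0}^{m-1} \{(|A_r|, |B|)^{rn+1}\,(|A_{r+1}|, |B|)^{rn+n-1}\} = (|A|, |B|)^{mn-1}.$$

Making use of all these simplifications, we get

$$\prod_{r=0}^{m-1} \{F\} = (-1, -1)^{m}\,\{c(B)\}^{m} \times (|A|, -1)^{\frac12 n(n-1)} \times \{(-1, -1)\,c(A)\}^{n}\,(|B|, -1)^{\frac12 m(m-1)}\,(|A|, |B|)^{mn-1}.$$

This, after a little rearrangement and restoring the prime p, reduces to 2.4.
This completes the proof of the theorem.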
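
The identity 2.4 can be spot-checked for small integer matrices. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the Hilbert symbol is computed for non-zero integers from the standard explicit formulas (Legendre symbols for odd p, the residues mod 8 for p = 2), and $c_p$ is evaluated from Definition 1.2. It checks particular cases only and is not a substitute for the proof.

```python
def split(a, p):
    """Write a = p**alpha * u with p not dividing u; return (alpha, u)."""
    alpha = 0
    while a % p == 0:
        a //= p
        alpha += 1
    return alpha, a


def legendre(u, p):
    return 1 if pow(u % p, (p - 1) // 2, p) == 1 else -1


def hilbert(a, b, p):
    """Hilbert norm residue symbol (a, b)_p for non-zero integers a, b and a prime p."""
    alpha, u = split(a, p)
    beta, v = split(b, p)
    if p == 2:
        eps = lambda x: ((x - 1) // 2) % 2
        omega = lambda x: ((x * x - 1) // 8) % 2
        e = eps(u) * eps(v) + alpha * omega(v) + beta * omega(u)
        return -1 if e % 2 else 1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(u, p) ** beta * legendre(v, p) ** alpha


def det(M):
    """Integer determinant by cofactor expansion (adequate for small matrices)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def hasse(M, p):
    """c_p(M) from Definition 1.2, using the leading principal minors of M."""
    D = [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]
    c = hilbert(-1, -D[-1], p)
    for i in range(len(D) - 1):
        c *= hilbert(D[i], -D[i + 1], p)
    return c


def kron(R, S):
    return [[r * s for r in R_row for s in S_row] for R_row in R for S_row in S]


def rhs_2_4(A, B, p):
    m, n = len(A), len(B)
    dA, dB = det(A), det(B)
    return (hilbert(-1, -1, p) ** (m + n - 1)
            * hasse(A, p) ** n * hasse(B, p) ** m
            * hilbert(dA, -1, p) ** (n * (n - 1) // 2)
            * hilbert(dB, -1, p) ** (m * (m - 1) // 2)
            * hilbert(dA, dB, p) ** (m * n - 1))


A = [[1, 1], [1, 3]]     # sample symmetric matrices with non-zero leading minors
B = [[2, 1], [1, 3]]
agree = all(hasse(kron(A, B), p) == rhs_2_4(A, B, p) for p in (2, 3, 5, 7, 11))
print(agree)
```

The check runs over a handful of primes; any symmetric integer matrices with non-vanishing leading principal minors can be substituted for the samples.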


3. Some particular cases. We shall show that two well-known formulae 
are particular cases of the result 2.4. 

Jones (2) has shown that if a is a non-zero rational number and B is an
n × n matrix whose Hasse-Minkowski invariant is defined, then

$$c_p(aB) = c_p(B)\,(a, -1)_p^{\frac12 n(n+1)}\,(a, |B|)_p^{n-1}. \tag{3.1}$$

This can be easily deduced from 2.4 by observing that aB = a × B, a
being a scalar.
MacDuffee (3) has defined the direct sum of matrices A and B by the relation

$$A \dotplus B = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$$

where 0 is a null matrix of appropriate order. Let B be an n × n matrix whose
Hasse-Minkowski invariant is defined and let

$$A_m = B \dotplus B \dotplus \cdots \dotplus B,$$

there being m B's in the direct sum. Bose and Connor (4) have shown that

$$c_p(A_m) = (-1, -1)_p^{m-1}\,\{c_p(B)\}^{m}\,(|B|, -1)_p^{\frac12 m(m-1)}. \tag{3.2}$$

This can also be deduced as a particular case of 2.4 by observing that
$A_m = B \dotplus B \dotplus \cdots \dotplus B = I_m \times B$ where $I_m$ is the identity matrix of
order m.
Applications of the result 2.4 to some combinatorial problems connected 
with statistical designs are being investigated (5). 


4. Summary and acknowledgment. In this paper the Hasse-Minkowski
invariant $c_p(A \times B)$ of the Kronecker product of matrices A and B is obtained
in terms of $c_p(A)$, $c_p(B)$, |A| and |B|; and two well known results regarding the
Hasse-Minkowski invariant are shown to be particular cases of the result.
















I wish to express my sincere thanks to Professor M. C. Chakrabarti for his 
kind interest in this work. 


REFERENCES 


1. F. D. Murnaghan, The Theory of Group Representation (Baltimore, 1938).

2. B. W. Jones, The Arithmetic Theory of Quadratic Forms (New York, 1950).

3. C. C. MacDuffee, Theory of Matrices (New York, 1946).

4. R. C. Bose and W. S. Connor, Combinatorial properties of group-divisible incomplete block
designs, Ann. Math. Stat., 23 (1952), 367-383.

5. M. N. Vartak, On an application of the Kronecker product of matrices to statistical designs,
Ann. Math. Stat., 26 (1955), 420-438.


University of Bombay 
India 














A FAMILY OF DIFFERENCE SETS 
R. G. STANTON AND D. A. SPROTT 


1. Introduction. A difference set (G, D) is defined in (2) as a subset
D of k elements in a group G of order v with the following properties:


(1) if $x \in G$, $x \neq 1$, there are exactly λ distinct ordered pairs $(d_1, d_2)$ of
elements of D such that $x = d_1^{-1} d_2$;


(2) if $x \in G$, $x \neq 1$, there are exactly λ distinct ordered pairs $(d_3, d_4)$ of
elements of D such that $x = d_3 d_4^{-1}$.


If G is abelian, the difference set (G, D) is said to be abelian; if G is cyclic,
(G, D) is said to be cyclic.

A cyclic difference set was earlier defined as a set of k distinct residues
$d_1, d_2, \ldots, d_k$ (modulo v) with the property that all non-zero residues modulo
v occur exactly λ times among the differences $d_i - d_j$ ($i \neq j$). Such sets have
been studied in detail (3). Although the general difference set (G, D) was
introduced in (2), the abelian difference set was used previously (1) to construct
symmetrical balanced incomplete block designs. Thus, if $v = 4\lambda + 3 = p^n$,
where p is prime, and if x is a primitive element of $GF(p^n)$, the set

$$x^0,\ x^2,\ x^4,\ \ldots,\ x^{4\lambda}$$

is an abelian difference set. If n = 1, so that v is prime, then this becomes a
cyclic difference set modulo v.
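
For n = 1 the statement is easy to test by direct counting. The sketch below is only an illustration: the function names are assumptions, the primitive root is found by brute force, and the primes tried are small ones of the form 4λ + 3.

```python
from collections import Counter


def primitive_root(p):
    """Smallest primitive root modulo the prime p, found by brute force."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found")


def even_power_set(p):
    """The set x^0, x^2, ..., x^(4*lam) modulo p = 4*lam + 3."""
    x = primitive_root(p)
    lam = (p - 3) // 4
    return {pow(x, 2 * i, p) for i in range(2 * lam + 1)}


def difference_counts(D, v):
    """How often each residue occurs as a difference a - b of distinct elements of D."""
    return Counter((a - b) % v for a in D for b in D if a != b)


for p in (7, 11, 19):
    D = even_power_set(p)
    counts = difference_counts(D, p)
    lam = (p - 3) // 4
    print(p, sorted(D), all(counts[r] == lam for r in range(1, p)))
```

Each non-zero residue should appear exactly λ times among the differences, which is the defining property quoted above.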

In (3), a generalization of previous work on cyclic difference sets is presented,
and a system of classification is given for sets in which $k < \tfrac12 v$ and $3 \le k \le 50$.

It is the purpose of this paper to prove the existence of the abelian difference
set with parameters $v = p^n(p^n + 2)$, $k = \tfrac12(v - 1)$, $\lambda = \tfrac14(v - 3)$, where
$p^n$ and $q^m = p^n + 2$ are both prime powers. This includes the balanced incomplete
block design v = b = 63, r = k = 31, λ = 15. If n = m = 1, the
result is a family of cyclic difference sets including the case v = 35, k = 17,
λ = 8, given in (3). In what follows, p and q will always denote primes, and
x and y will be primitive elements of $GF(p^n)$ and $GF(q^m)$ respectively; also,
$q^m = p^n + 2$.

The difference set will be constructed by using the Galois Domain GD(v),
that is, the set of elements (α, β), where $\alpha \in GF(p^n)$ and $\beta \in GF(q^m)$. Addition
and multiplication are defined by the relations

$$(\alpha_1, \beta_1) + (\alpha_2, \beta_2) = (\alpha_1 + \alpha_2,\ \beta_1 + \beta_2),$$
$$(\alpha_1, \beta_1)\,(\alpha_2, \beta_2) = (\alpha_1 \alpha_2,\ \beta_1 \beta_2).$$


Received March 27, 1957 


It will be convenient to name the elements of GD(v) as follows:
$z^i = (x^i, y^i)$, $w^i = (x^i, 0)$, $u^i = (0, y^i)$, $0 = (0, 0)$.
In particular, $z^i z^j = z^{i+j}$.
The elements of GD(v) form an additive abelian group; the subset of elements
of the form $z^i$ is a multiplicative abelian group of order $\tfrac12(s^2 - 1)$, where $s = p^n$.


The method of counting differences between elements is similar to that used
in (4) and (5).


2. Construction of the family. We prove the 


THEOREM. The elements

$$\bigl(z^0, z, z^2, \ldots, z^{\frac12(s^2-1)-1},\ 0,\ w^0, w, \ldots, w^{s-2}\bigr)$$

form a difference set with parameters $v = s(s + 2)$, $k = \tfrac12(v - 1)$, $\lambda = \tfrac14(v - 3)$,
where $s = p^n$, $s + 2 = q^m$.


Proof. The differences are of three types:
(1) differences $w^i(w^j - 1)$ and $\pm w^i$;
(2) differences $\pm(z^i - w^j)$ and $\pm z^i$;
(3) differences $z^i(z^j - 1)$.
The differences of type (1) can be written in the form

$$(x^i(x^j - 1),\ 0) \quad \text{and} \quad (\pm x^i,\ 0)$$

where i ranges from 0 to s - 2, and j ranges from 1 to s - 2. For j fixed, all elements
of the form $(x^i, 0) = w^i$ occur once among the differences $(x^i(x^j - 1), 0)$;
hence, as j ranges, these elements occur s - 2 times among such differences.
Thus, the differences of type (1) produce each element $w^i$ a total of (s - 2) + 2
= s times.

Now consider the differences of type (2), in particular, the differences with
negative signs

$$(x^i(x^{j-i} - 1),\ -y^i), \qquad (-x^i,\ -y^i).$$

Elements of the form $w^i = (x^i, 0)$ and 0 = (0, 0) do not occur at all among
these differences. For each value of i, j may range from 0 to s - 2; hence the
term $x^i(x^{j-i} - 1)$ takes on s - 1 distinct values. As i ranges from 0 to s, the
total number of distinct elements occurring among these negative type
(2) differences is

$$(s - 1)(s + 1) + (s + 1) = s(s + 1).$$

Thus all elements not of the form (a, 0) occur once. Taking into account
the plus and minus signs, and allowing i to range from 0 to $\tfrac12(s^2 - 1) - 1$,
we see that all elements not of the form (a, 0) occur among the type (2)
differences a total of $(s^2 - 1)/(s + 1) = s - 1$ times.







Finally, consider the type (3) differences $z^i(z^j - 1)$. These are elements
of the form

$$(x^i(x^j - 1),\ y^i(y^j - 1)).$$

If $j = c(s - 1)$, where $c = 1, 2, \ldots, \tfrac12(s - 1)$, these elements have the form

$$(0,\ y^i(y^j - 1)).$$

For a given c, all elements (0, b) occur $\tfrac12(s^2 - 1)/(s + 1) = \tfrac12(s - 1)$ times;
as c ranges, such elements occur $\tfrac14(s - 1)^2$ times. Since they occurred s - 1
times among the differences of type (2), they must occur $\tfrac14(s^2 + 2s - 3)$
times in all.

Similarly, if $j = d(s + 1)$, where $d = 1, 2, \ldots, \tfrac12(s - 3)$, the type (3)
differences are of the form

$$(x^i(x^{2d} - 1),\ 0).$$

For d fixed, all elements of the form (a, 0) occur $\tfrac12(s + 1)$ times; as d ranges,
they occur $\tfrac14(s + 1)(s - 3)$ times. But they occurred s times among the
differences of type (1); so they must occur $\tfrac14(s^2 + 2s - 3)$ times in all.

If j does not have the form c(s - 1) nor the form d(s + 1), then for j
fixed, the differences

$$(x^i(x^j - 1),\ y^i(y^j - 1))$$

range over half of the $s^2 - 1$ elements of the form (a, b); for the same j, the
differences

$$-(x^i(x^j - 1),\ y^i(y^j - 1)) = \bigl(x^{i + \frac12(s-1)}(x^j - 1),\ y^{i + \frac12(s+1)}(y^j - 1)\bigr)$$

range over the other half of the elements of the form (a, b). For, if this were
not so, we should have

$$i + \tfrac12(s - 1) \equiv r \pmod{s - 1},$$
$$i + \tfrac12(s + 1) \equiv r \pmod{s + 1}.$$

Writing these congruences as equations, and subtracting, we find

$$k_1(s - 1) - k_2(s + 1) + 1 = 0,$$

where both $k_1$ and $k_2$ are integers. This is impossible, since 2 is a divisor of
s - 1 and s + 1.
Thus we conclude that, for j fixed, the differences

$$\pm(x^i(x^j - 1),\ y^i(y^j - 1))$$

range once over all elements of the form (a, b). As j ranges, these elements will
occur a total of $\tfrac12\{\tfrac12(s^2 - 1) - 1 - \tfrac12(s - 3) - \tfrac12(s - 1)\} = \tfrac14(s^2 - 2s + 1)$
times. Since they occurred s - 1 times among the differences of type (2),
they will occur $\tfrac14(s^2 + 2s - 3)$ times in all. This completes the theorem:
all non-zero elements (a, 0), (0, b), and (a, b) occur $\lambda = \tfrac14(s^2 + 2s - 3)$ times.
















3. Example. Take p = 7, p + 2 = 3². To construct GF(9), we use the
irreducible polynomial $f(x) = x^2 + 2x + 2$; then x is a primitive element,
and the field is made up of elements

$$x^0 = 1,\ x^1 = x,\ x^2 = x + 1,\ x^3 = 2x + 1,\ x^4 = 2,\ x^5 = 2x,\ x^6 = 2x + 2,\ x^7 = x + 2,$$

together with 0. Since 3 is a primitive root modulo 7, we take z = (3, x),
w = (3, 0), 0 = (0, 0). The difference set is

$$(z^0, z, z^2, \ldots, z^{23},\ 0,\ w^0, w, w^2, \ldots, w^5),$$

with parameters v = 63, k = 31, λ = 15. Two cyclic difference sets with these
parameters are given in (3).
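
The example can be verified by brute-force counting in the additive group of GD(63). The sketch below is illustrative and rests on assumptions of representation: an element of GF(9) is stored as a pair (a0, a1) standing for a0 + a1·x, multiplication uses $x^2 = x + 1$ (from $x^2 + 2x + 2 = 0$ modulo 3), and all function names are hypothetical.

```python
from collections import Counter


def gf9_mul(u, v):
    """Product of a0 + a1*x and b0 + b1*x in GF(9), reducing with x^2 = x + 1."""
    a0, a1 = u
    b0, b1 = v
    c2 = a1 * b1                         # coefficient of x^2 before reduction
    return ((a0 * b0 + c2) % 3, (a0 * b1 + a1 * b0 + c2) % 3)


def build_set():
    """The set (z^0, ..., z^23, 0, w^0, ..., w^5) with z = (3, x), w = (3, 0)."""
    D = {(0, (0, 0))}
    a, b = 1, (1, 0)                     # z^0 = (3^0, x^0)
    for _ in range(24):
        D.add((a, b))
        a, b = (a * 3) % 7, gf9_mul(b, (0, 1))
    a = 1
    for _ in range(6):                   # w^i = (3^i, 0)
        D.add((a, (0, 0)))
        a = (a * 3) % 7
    return D


def sub(u, v):
    """Difference in the additive group GF(7) x GF(9)."""
    return ((u[0] - v[0]) % 7,
            ((u[1][0] - v[1][0]) % 3, (u[1][1] - v[1][1]) % 3))


D = build_set()
counts = Counter(sub(u, v) for u in D for v in D if u != v)
print(len(D), len(counts), set(counts.values()))
```

A set of 31 elements has 31 · 30 = 930 ordered differences, which is 62 · 15, so every one of the 62 non-zero elements occurring 15 times is exactly what the parameters require.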

To show that the difference set constructed in this example is not isomorphic 
to either of these cyclic difference sets, we consider incidence relations among 
the blocks of the corresponding balanced incomplete block designs. We first 
note that if, in any one of these three designs, there exist seven blocks with 
seven common elements, then these seven common elements must differ by 
an element of additive period seven. It follows at once that the 2 cyclic designs 
of (3) have the property that 0,9,18,27,36,45,54 all occur in seven blocks, and 
they are the only elements with this property; in the design constructed in 
this example, the elements (0,0), (1,0), (2,0), (3,0), (4,0), (5,0), and (6,0) 
all occur in seven blocks, and they are the only elements with this property. 
Consequently, if the design under consideration is to be isomorphic to either 
of the designs in (3), then the elements (0,0), ..., (6,0) must correspond to 
the elements 0, . . . , 54 in some order. 

Now both the cyclic designs have one property in common; if any pair of 
the elements 0,9,18,27,36,45,54 occurs in a block, then a third determined 
element occurs in that block, that is, any triplet of these numbers either does 
not occur in any block, or it occurs in 15 blocks. In the case of the projective 
design, this follows from the fact that any two points determine a line, and 
any 4-space through the line must contain the third point on that line; in 
the case of the other cyclic design, it is clear that the triplet (0,9,45) occurs 
15 times and hence, by addition, so do the triplets (9,18,54), (18,27,0), 
(27,36,9), (36,45,18), (45,54,27), (54,0,36); since this accounts for all 21 
possible doublets of these elements, no other triplets can occur. 

On the other hand, the triplet structure of the present design is quite 
different; consider, for example, the pair of elements (0,0) and (1,0). They 
occur together in 15 blocks, of which seven blocks contain the other elements 
(2,0), (3,0), (4,0), (5,0) and (6,0). But an elementary discussion of con- 
gruences shows that the other 8 blocks containing (0,0) and (1,0) can contain, 
in each case, only one element of the form (a,0). For 4 of these blocks, it is 
(3,0); for the other 4, it is (5,0). Thus, in this design, the triplets (0,0), (1,0), 
(3,0) and (0,0), (1,0), (5,0) occur exactly 11 times. This completes the demon- 
stration that we do not have either of Hall’s cyclic designs. 










If n = m = 1, the result is the family of cyclic difference sets with
$v = p(p + 2)$, $k = \tfrac12(v - 1)$, $\lambda = \tfrac14(v - 3)$. In this case, the pairs (x, y)
and (x, 0) can be replaced by the residues z and w modulo v defined by

$$z \equiv x \pmod{p}, \qquad z \equiv y \pmod{p + 2};$$
$$w \equiv x \pmod{p}, \qquad w \equiv 0 \pmod{p + 2}.$$

4. Two further examples. (i) Suppose p = 3, p + 2 = 5; then v = 15,
k = 7, λ = 3. Since 2 is a primitive root mod 3 and mod 5, we can take
z = 2, w = 5, thus generating the difference set

$$(2^0, 2, 2^2, 2^3, 0, 5, 5^2) = (1, 2, 4, 8, 0, 5, 10).$$

Of course, this set could also have been obtained from a projective geometry.
(ii) Suppose p = 5, p + 2 = 7; then v = 35, k = 17, λ = 8. Here 3 is a
primitive root mod 5 and mod 7; so z = 3, w = 28. The set is

$$(3^0, 3, 3^2, \ldots, 3^{11}, 0, 28, 28^2, 28^3, 28^4),$$

or

$$(1, 3, 9, 27, 11, 33, 29, 17, 16, 13, 4, 12, 0, 28, 14, 7, 21),$$

as given in (3). Since 2 and 3 are primitive roots mod 5 and mod 7 respectively,
z = 17 and w = 7 could also have been used to generate this difference set.
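
Both examples can be regenerated from the congruences defining z and w. The sketch below is an illustration only: the function names are assumptions, the residues z and w are found by a brute-force search over the residues modulo v, and the powers of w are taken with exponents 1, ..., p - 1, which gives the same elements as $w^0, \ldots, w^{p-2}$ in the pair notation.

```python
from collections import Counter


def crt_pair(a, p, b, q):
    """Smallest residue r modulo p*q with r = a (mod p) and r = b (mod q)."""
    return next(r for r in range(p * q) if r % p == a % p and r % q == b % q)


def twin_prime_set(p, x, y):
    """Difference set modulo v = p(p + 2) from primitive roots x mod p and y mod p + 2."""
    q = p + 2
    v = p * q
    z = crt_pair(x, p, y, q)
    w = crt_pair(x, p, 0, q)
    D = {pow(z, i, v) for i in range((p * p - 1) // 2)}
    D.add(0)
    D |= {pow(w, i, v) for i in range(1, p)}
    return v, D


def is_difference_set(D, v, lam):
    counts = Counter((a - b) % v for a in D for b in D if a != b)
    return all(counts[r] == lam for r in range(1, v))


v15, D15 = twin_prime_set(3, 2, 2)
v35, D35 = twin_prime_set(5, 3, 3)
print(sorted(D15), is_difference_set(D15, v15, 3))
print(sorted(D35), is_difference_set(D35, v35, 8))
```

Calling `twin_prime_set(5, 2, 3)` corresponds to the alternative choice z = 17, w = 7 mentioned at the end of example (ii).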


REFERENCES 


1. R. C. Bose, On the construction of balanced incomplete block designs, Ann. Eugenics, 9 (1939),
353-399.

2. R. H. Bruck, Difference sets in a finite group, Trans. Amer. Math. Soc., 73 (1955), 464-481.

3. Marshall Hall, A survey of difference sets, Proc. Amer. Math. Soc., 7 (1956), 975-986.

4. D. A. Sprott, A note on balanced incomplete block designs, Can. J. Math., 6 (1954), 341-346.

5. D. A. Sprott, Some series of balanced incomplete block designs, Sankhyā, 17 (1956), 185-192.


University of Toronto 














NETWORK FLOW AND SYSTEMS OF 
REPRESENTATIVES 


L. R. FORD, Jr., AND D. R. FULKERSON


Introduction. The theory developed for the study of flows in networks
(2; 3; 4; 5; 6; 7) sometimes provides a useful tool for dealing with certain
kinds of combinatorial problems, as has been previously indicated in (3; 4; 6; 7).
In particular, Hall-type theorems for the existence of systems of distinct
representatives which contain a prescribed set of marginal elements (10; 11),
or, more generally, whose intersection with each member of a given partition
of the fundamental set has a cardinality between prescribed lower and upper
bounds (9), can be obtained in this way (7). In this note we apply network
flow theory to generate necessary and sufficient conditions for (a) the existence
of a system of restricted representatives, by which we mean a system of
representatives such that each element $a_i$ of the fundamental set occurs at
least $\alpha_i$ times in the system, and at most $\beta_i$ times, and (b) the existence of a
common system of restricted representatives for two different collections of
subsets of the fundamental set. While problem (b) clearly includes (a), we
have chosen to treat the two separately.

Section 1 describes relevant portions of flow theory. In §2 we show how 
Hall’s condition for the existence of a system of distinct representatives and a 
similar condition for problem (a) may be deduced from maximal network flow 
problems. Section 3 deals with problem (b) and resolves (a) as a special case. 

We emphasize that the present approach may be used not only to yield 
existence conditions for certain kinds of systems of representatives, but may 
also be used to provide explicit algorithms for constructing such as well. 
On the other side of the ledger, it can be shown, although we do not demon- 
strate it in this paper, that each of the problems we have mentioned can be 
reduced, by suitably manipulating the network which represents the problem,
to an application of Hall's theorem. 


1. Network Flow. A basic problem concerning network flows is the following.
Suppose given a finite network (linear graph) N with node set $\{s, \ldots, x,
y, \ldots, s'\}$ and oriented arcs joining pairs of nodes, the arc from x to y being
denoted by (x, y), and suppose each (x, y) has associated with it a capacity
c(x, y), where c(x, y) is either a non-negative real number or plus infinity.
Subject to the conditions (i) the flow in (x, y) is no greater than c(x, y),
(ii) the total flow into node x ($x \neq s, s'$) is equal to the flow out, find a maximal
flow from s (the source) to s' (the sink).


Received March 18, 1957. 


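Thus, letting f(x, y) be the flow in (x, y), the problem may be described as
a linear programme:

$$\begin{aligned}
&\text{(a)} && \sum_{y} [f(s, y) - f(y, s)] - v = 0 \\
&\text{(b)} && \sum_{y} [f(x, y) - f(y, x)] = 0 \qquad (x \neq s, s') \\
&\text{(c)} && \sum_{y} [f(s', y) - f(y, s')] + v = 0 \\
&\text{(d)} && 0 \le f(x, y) \le c(x, y) \\
&\text{(e)} && \text{maximize } v.
\end{aligned} \tag{1}$$

If (f; v) is a solution of the constraints (1a)-(1d), f is a flow and v its value.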

There are algorithms available for solving such problems. The best known 
of these is probably G. Dantzig’s simplex method (1) for solving the general 
linear programming problem of maximizing a linear function subject to linear 
equations and inequalities. However, problem (1) is a special kind of linear 
programme for which simple (and computationally more efficient) algorithms 
have been constructed (2; 4). These algorithms may be used to prove an 
intuitively plausible theorem which is basic in the study of network flow. 
To state this theorem, we require some definitions. A cut in N with respect to 
s, s’ is a partition of the nodes into two complementary sets L, L’ with s € L, 
s’ € L’. The value of a cut is 

$$\sum_{x \in L,\ y \in L'} c(x, y).$$

MINIMAL CUT THEOREM (3; 4; 5). For any network, the maximal flow value
is equal to the minimal cut value.


We remark that it is obvious that flow values are bounded above by cut 
values. Thus the content of the theorem is the assertion that there is a flow 
and a cut for which equality of values holds. 

In addition to this theorem, we need one other result for the combinatorial 
applications to be presented in the sequel. 


INTEGRITY THEOREM (3; 4). If the capacity function is integral valued, there 
exists a maximal flow which is also integral valued. 


The integrity theorem can also be deduced in a variety of ways. For 
example, the algorithms for constructing maximal flows which were referred 
to previously can be shown to produce integral flows in case the arc capacities 
c(x, y) are integers. The theorem also follows from the fact that all the extreme
points of the convex polyhedron defined by (1a)-(1d) are integral.
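
The two theorems can be checked on a small example. The following is a minimal sketch, assuming a breadth-first augmenting-path search; it is not claimed to be the algorithm of (2) or (4). The function name `max_flow_min_cut` and the five-arc network are illustrative assumptions.

```python
from collections import deque

def max_flow_min_cut(capacity, s, t):
    """Return (maximal flow value, source side L of a minimal cut).

    capacity maps arcs (x, y) to non-negative numbers.
    """
    residual = {}
    for (x, y), c in capacity.items():
        residual[(x, y)] = residual.get((x, y), 0) + c
        residual.setdefault((y, x), 0)  # reverse arc, used to cancel flow
    neighbours = {}
    for (x, y) in residual:
        neighbours.setdefault(x, []).append(y)

    value = 0
    while True:
        # Breadth-first search for a flow-augmenting path from s to t.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            x = queue.popleft()
            for y in neighbours.get(x, []):
                if y not in parent and residual[(x, y)] > 0:
                    parent[y] = x
                    queue.append(y)
        if t not in parent:
            # No augmenting path is left; the nodes reached from s form L.
            return value, set(parent)
        path = []
        y = t
        while parent[y] is not None:
            path.append((parent[y], y))
            y = parent[y]
        delta = min(residual[arc] for arc in path)
        for (x, y) in path:
            residual[(x, y)] -= delta
            residual[(y, x)] += delta
        value += delta

# Hypothetical network with integral capacities.
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 3}
value, L = max_flow_min_cut(capacity, "s", "t")
cut_value = sum(c for (x, y), c in capacity.items() if x in L and y not in L)
print(value, cut_value)
```

With integral capacities every augmentation is by an integer, so the flow stays integral, in line with the integrity theorem; the value of the cut returned equals the flow value, as the minimal cut theorem asserts.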


2. Hall’s Theorem; Systems of Restricted Representatives. Let
$\mathcal{S} = \{S_1, \ldots, S_n\}$ be a family of subsets of a given set
$A = \{a_1, \ldots, a_m\}$. A list R of (not necessarily distinct) elements

$$a_{i_1}, \ldots, a_{i_n}$$

is a system of representatives for $\mathcal{S}$ if

$$a_{i_j} \in S_j, \quad j = 1, \ldots, n.$$


If we further stipulate that each element $a_i \in A$ occurs in R at least
$\alpha_i$ times and at most $\beta_i$ times, where $0 \leq \alpha_i \leq \beta_i$,
we call R a system of restricted representatives (abbreviated SRR). In case
$\alpha_i = 0$, $\beta_i = 1$ for all i, then R is a system of distinct
representatives (abbreviated SDR). A well-known theorem of P. Hall (8) states
that a necessary and sufficient condition for the existence of an SDR is that,
for each k = 1, ..., n, every union of k sets of $\mathcal{S}$ contains at
least k elements. The necessity of the condition is of course obvious.

As an exercise, let us construct a network maximal flow problem which 
represents the problem of finding an SDR and deduce Hall’s condition from 
it. To this end, let 


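$$s,\ \bar{S}_1, \ldots, \bar{S}_n,\ \bar{a}_1, \ldots, \bar{a}_m,\ s'$$

be the nodes of N, and define arcs and capacities as follows:

    $(s, \bar{S}_j)$ with capacity 1, $j = 1, \ldots, n$,
    $(\bar{S}_j, \bar{a}_i)$ with capacity $\infty$, for all i, j such that $a_i \in S_j$,
    $(\bar{a}_i, s')$ with capacity 1, $i = 1, \ldots, m$.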


We assert that an SDR exists for $\mathcal{S}$ if and only if the maximal flow
value in N is n. For, given an SDR, we can construct an integral flow of value
n as follows. Let

    $f(s, \bar{S}_j) = 1$,
    $f(\bar{S}_j, \bar{a}_i) = 1$ if $a_i$ occurs in the SDR as the representative of $S_j$, and 0 otherwise,
    $f(\bar{a}_i, s') = 1$ if $a_i$ occurs in the SDR, and 0 otherwise.

This is clearly a flow in N of value n; it is certainly maximal since the cut value

$$\sum_{j=1}^{n} c(s, \bar{S}_j)$$

is also n. Conversely, if the maximal flow value is n, we may select (by the
integrity theorem) an integral flow of value n, and let $a_i$ represent $S_j$ if
and only if $f(\bar{S}_j, \bar{a}_i) = 1$. Then all sets $S_j$ are represented
(since the flow has value n and $c(s, \bar{S}_j) = 1$) and no $a_i$ occurs more
than once in the representation (since $c(\bar{a}_i, s') = 1$). Thus an SDR
exists for $\mathcal{S}$ if and only if the maximal flow value (that is,
minimal cut value) for the associated network is n.

To discover Hall’s condition, we simply examine all candidates for minimal
cuts, and insist that their values be at least n. First let us introduce some
notation. Given two disjoint subsets X, Y of the nodes of a network N, let
(X, Y) denote the set of arcs from any node of X to any node of Y, and let

$$c(X, Y) = \sum_{(x, y) \in (X, Y)} c(x, y).$$



Also, for any set X of nodes, let J(X) be the set of nodes of N which are 
joined to some node of X by an arc. Finally, let |X| denote the cardinality of 
set X. 

Suppose now that (L, L') is a cut in a representing network N for the SDR
problem. Let $\bar{S} = \{\bar{S}_1, \ldots, \bar{S}_n\}$,
$\bar{A} = \{\bar{a}_1, \ldots, \bar{a}_m\}$, and define subsets of the
nodes of N as follows:

$$X = L \cap \bar{S};\quad X' = L' \cap \bar{S};\quad Y = L \cap \bar{A};\quad Y' = L' \cap \bar{A}.$$

Then the condition which is equivalent to the existence of a flow of value n is

(2)  $c(L, L') = c(s, X') + c(X, Y') + c(Y, s') \geq n$

for all cuts (L, L'), or equivalently, for all $X \subset \bar{S}$, $Y \subset \bar{A}$.

Now (2) holds automatically unless (X, Y') is vacuous, and (X, Y') empty
implies $J(X) \cap \bar{A} \subset Y$. Thus the set of inequalities (2) is
equivalent to the set

(3)  $|X'| + |Y| \geq n$

for all $X \subset \bar{S}$, all $Y \supset J(X) \cap \bar{A}$, and hence to

(4)  $|X'| + |J(X) \cap \bar{A}| \geq n$

for all $X \subset \bar{S}$. Replacing |X'| by n - |X| in (4) yields

(5)  $|X| \leq |J(X) \cap \bar{A}|$, all $X \subset \bar{S}$.

All that remains is to restate (5) in the language of sets: for any subset X of
the indices {1, ..., n},

(6)  $|X| \leq |I(X)|$,

where $I(X) \subset \{1, \ldots, m\}$ is the index set of $\bigcup_{j \in X} S_j$.
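
Condition (6) can be compared with a direct search for an SDR on small families. The sketch below is illustrative only: `find_sdr` uses the usual augmenting-path search for a bipartite matching (the unit-capacity special case of the flow problem), `hall_condition` tests (6) by brute force over all index sets, and both names and the two sample families are assumptions made for the example.

```python
from itertools import combinations

def find_sdr(sets):
    """Return a list rep with rep[j] in sets[j], all entries distinct, or None."""
    owner = {}  # element -> index of the set it currently represents

    def augment(j, seen):
        # Try to give set j a representative, re-assigning others if needed.
        for a in sorted(sets[j]):
            if a in seen:
                continue
            seen.add(a)
            if a not in owner or augment(owner[a], seen):
                owner[a] = j
                return True
        return False

    for j in range(len(sets)):
        if not augment(j, set()):
            return None
    rep = [None] * len(sets)
    for a, j in owner.items():
        rep[j] = a
    return rep

def hall_condition(sets):
    """Check |X| <= |I(X)| for every non-empty index set X."""
    n = len(sets)
    for k in range(1, n + 1):
        for X in combinations(range(n), k):
            union = set().union(*(sets[j] for j in X))
            if len(union) < k:
                return False
    return True

good = [{1, 2}, {2, 3}, {1, 3}]
bad = [{1}, {1}, {1, 2}]
print(find_sdr(good), hall_condition(good))
print(find_sdr(bad), hall_condition(bad))
```

The brute-force test is exponential in n and is meant only to illustrate the statement; the matching search is the practical way to decide existence.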

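With this as background, let us next turn to the question of the existence
of an SRR. For this problem, let

$$s,\ \bar{S}_1, \ldots, \bar{S}_n,\ \bar{a}_1, \ldots, \bar{a}_m,\ t,\ s'$$

be the nodes of N; the arcs and capacities are

    $(s, \bar{S}_j)$ with capacity 1, $j = 1, \ldots, n$,
    $(\bar{S}_j, \bar{a}_i)$ with capacity $\infty$, for all i, j such that $a_i \in S_j$,
    $(\bar{a}_i, t)$ with capacity $\beta_i - \alpha_i$, $i = 1, \ldots, m$,
    $(\bar{a}_i, s')$ with capacity $\alpha_i$, $i = 1, \ldots, m$,
    $(t, s')$ with capacity $n - \sum_{i=1}^{m} \alpha_i$.

(Notice that we are tacitly assuming $n \geq \sum_{i=1}^{m} \alpha_i$, obviously a
necessary condition for the existence of an SRR.)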

It is not difficult to see that an SRR exists if and only if the maximal flow
through N has value n. Define X, Y, X', Y' as before, and suppose first that
$t \in L$. Then the relevant condition is

(7)  $c(s, X') + c(X, Y') + c(Y, s') + c(t, s') \geq n$

for all $X \subset \bar{S}$, $Y \subset \bar{A}$. Proceeding as before, (7) leads to

(8)  $|X'| + \sum_{Y} \alpha_i + n - \sum_{i=1}^{m} \alpha_i \geq n$

for all $X \subset \bar{S}$, all $Y \supset J(X) \cap \bar{A}$, and thus to

(9)  $|X| \leq n - \sum_{i=1}^{m} \alpha_i + \sum_{J(X) \cap \bar{A}} \alpha_i$

for all $X \subset \bar{S}$.
In a similar manner, if $t \in L'$, one obtains the condition

(10)  $|X| \leq \sum_{J(X) \cap \bar{A}} \beta_i$

for all $X \subset \bar{S}$. Thus we may state (since (9) includes the condition
$n - \sum_{i=1}^{m} \alpha_i \geq 0$):


THEOREM 1. An SRR exists for $\mathcal{S} = \{S_1, \ldots, S_n\}$ if and only if,
for every subset X of the indices {1, ..., n},

(11)  $|X| \leq \min\Big( n - \sum_{i=1}^{m} \alpha_i + \sum_{I(X)} \alpha_i,\ \sum_{I(X)} \beta_i \Big)$,

where $I(X) \subset \{1, \ldots, m\}$ is the index set of $\bigcup_{j \in X} S_j$.


Observe that (11) reduces to Hall’s condition in case $\alpha_i = 0$, $\beta_i = 1$
for all i. Also, if $\alpha_i = 1$ for $i = 1, \ldots, q$, and $\alpha_i = 0$ for
$i = q + 1, \ldots, m$, all $\beta_i = 1$, (11) yields the Hoffman-Kuhn condition
(10) for the existence of an SDR containing a prescribed set of marginal
elements $a_1, \ldots, a_q$.
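
Theorem 1 can be tested on small instances by comparing condition (11) with an exhaustive search over all lists R. The sketch below is an illustration under stated assumptions: the function names, the dictionaries `alpha` and `beta` (keyed by the elements of A) and the sample family are hypothetical, and both routines are exponential and intended for tiny examples only.

```python
from collections import Counter
from itertools import combinations, product

def srr_exists_bruteforce(sets, alpha, beta):
    """Search all lists R with R[j] in sets[j] for one meeting the bounds."""
    for R in product(*(sorted(S) for S in sets)):
        count = Counter(R)
        if all(alpha[a] <= count[a] <= beta[a] for a in alpha):
            return True
    return False

def condition_11(sets, alpha, beta):
    """Check inequality (11) for every subset X of the indices."""
    n = len(sets)
    total_alpha = sum(alpha.values())
    for k in range(n + 1):
        for X in combinations(range(n), k):
            I = set().union(*(sets[j] for j in X))
            bound = min(n - total_alpha + sum(alpha[a] for a in I),
                        sum(beta[a] for a in I))
            if len(X) > bound:
                return False
    return True

sets = [{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]
alpha = {1: 1, 2: 1, 3: 0}
beta = {1: 2, 2: 2, 3: 1}
tight_beta = {1: 1, 2: 1, 3: 1}   # four sets cannot share three elements once each

print(srr_exists_bruteforce(sets, alpha, beta), condition_11(sets, alpha, beta))
print(srr_exists_bruteforce(sets, alpha, tight_beta),
      condition_11(sets, alpha, tight_beta))
```

Taking X empty in `condition_11` reproduces the tacit requirement that n is at least the sum of the lower bounds.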


3. Existence of Common SRR. Since the only ingenuity required in 
solving problems of the kind we have discussed lies in finding a representing 
network (if one exists), we shall merely give a description of such a network 
for the common SRR problem and a statement of the conditions, leaving the 
proof to the reader. 

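Let $\mathcal{S} = \{S_1, \ldots, S_n\}$, $\mathcal{T} = \{T_1, \ldots, T_n\}$ be
the two families of subsets of $A = \{a_1, \ldots, a_m\}$. Define a network N
consisting of nodes

$$s,\ \bar{S}_1, \ldots, \bar{S}_n,\ \bar{a}_1, \ldots, \bar{a}_m,\ \bar{a}'_1, \ldots, \bar{a}'_m,\ \bar{T}_1, \ldots, \bar{T}_n,\ s'$$

and arcs

    $(s, \bar{S}_j)$ with capacity 1, $j = 1, \ldots, n$,
    $(s, \bar{a}'_i)$ with capacity $\alpha_i$, $i = 1, \ldots, m$,
    $(\bar{S}_j, \bar{a}_i)$ with capacity $\infty$, for all i, j such that $a_i \in S_j$,
    $(\bar{a}_i, \bar{a}'_i)$ with capacity $\beta_i - \alpha_i$, $i = 1, \ldots, m$,
    $(\bar{a}_i, s')$ with capacity $\alpha_i$, $i = 1, \ldots, m$,
    $(\bar{a}'_i, \bar{T}_j)$ with capacity $\infty$, for all i, j such that $a_i \in T_j$,
    $(\bar{T}_j, s')$ with capacity 1, $j = 1, \ldots, n$.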





As a common SRR exists for $\mathcal{S}$, $\mathcal{T}$ if and only if there is a
flow from s to s' of value

$$n + \sum_{i=1}^{m} \alpha_i,$$

similar procedures lead to the following theorem.


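THEOREM 2. A common SRR exists for $\mathcal{S} = \{S_1, \ldots, S_n\}$,
$\mathcal{T} = \{T_1, \ldots, T_n\}$ if and only if, for every
$X, Y \subset \{1, \ldots, n\}$,

(12)  $|X| + |Y| \leq n - \sum_{i=1}^{m} \alpha_i + \sum_{I(X) \cup I(Y)} \alpha_i + \sum_{I(X) \cap I(Y)} \beta_i$,

where $I(X) \subset \{1, \ldots, m\}$ is the index set of $\bigcup_{j \in X} S_j$, and
$I(Y) \subset \{1, \ldots, m\}$ is the index set of $\bigcup_{j \in Y} T_j$.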


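Notice that, for any given X, taking Y empty yields

$$|X| \leq n - \sum_{i=1}^{m} \alpha_i + \sum_{I(X)} \alpha_i$$

and taking Y the full set yields

$$|X| \leq -\sum_{i=1}^{m} \alpha_i + \sum_{I(X) \cup I(Y)} \alpha_i + \sum_{I(X) \cap I(Y)} \beta_i \leq \sum_{I(X)} \beta_i,$$

which combine to give (11). Conversely, if $S_j = T_j$ for all j, and if (11) holds
for all X, then (12) holds for all X, Y. To see this,¹ suppose given any two
sets $X, Y \subset \{1, \ldots, n\}$ and apply (11) to the sets $X \cup Y$,
$X \cap Y$, obtaining in particular

$$|X \cup Y| \leq n - \sum_{i=1}^{m} \alpha_i + \sum_{I(X \cup Y)} \alpha_i = n - \sum_{i=1}^{m} \alpha_i + \sum_{I(X) \cup I(Y)} \alpha_i,$$

$$|X \cap Y| \leq \sum_{I(X \cap Y)} \beta_i \leq \sum_{I(X) \cap I(Y)} \beta_i.$$

Adding these two inequalities gives (12).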
By taking $\alpha_i = 0$, $\beta_i = 1$ in (12), one obtains conditions for the
existence of a common SDR.


COROLLARY. A common SDR exists for $\mathcal{S}$ and $\mathcal{T}$ if and only if

(13)  $|X| + |Y| \leq n + |I(X) \cap I(Y)|$,

where I(X), I(Y) are as defined in Theorem 2.
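
As a check on the corollary, condition (13) can be compared with an exhaustive search on small families. The sketch below is illustrative: it takes a common SDR to be a list of distinct elements that represents the sets of the first family in order and, after some rearrangement, the sets of the second family; the function names and sample families are assumptions.

```python
from itertools import combinations, permutations, product

def common_sdr_bruteforce(S, T):
    """Return a common SDR of the families S and T, or None."""
    n = len(S)
    for R in product(*(sorted(s) for s in S)):
        if len(set(R)) < n:
            continue  # representatives must be distinct
        for arrangement in permutations(R):
            if all(arrangement[j] in T[j] for j in range(n)):
                return R
    return None

def condition_13(S, T):
    """Check |X| + |Y| <= n + |I(X) & I(Y)| for all index sets X, Y."""
    n = len(S)
    index_sets = [X for k in range(n + 1) for X in combinations(range(n), k)]
    for X in index_sets:
        IX = set().union(*(S[j] for j in X))
        for Y in index_sets:
            IY = set().union(*(T[j] for j in Y))
            if len(X) + len(Y) > n + len(IX & IY):
                return False
    return True

S1, T1 = [{1, 2}, {2, 3}, {3, 4}], [{1}, {2, 3}, {4}]
S2, T2 = [{1, 2}, {1, 2}], [{1}, {1, 3}]
print(common_sdr_bruteforce(S1, T1), condition_13(S1, T1))
print(common_sdr_bruteforce(S2, T2), condition_13(S2, T2))
```

In the second example the element 3 is needed by the second family but is unavailable to the first, and (13) fails for X and Y both equal to the full index set.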





¹This short proof is due to O. Gross.







REFERENCES 


1. G. B. Dantzig, A. Orden, and P. Wolfe, The generalized simplex method for minimizing a
linear form under linear inequality constraints, Pacific J. Math., 5 (1955), 183-195.

2. G. B. Dantzig and D. R. Fulkerson, Computation of maximal flows in networks, Naval
Research Logistics Quarterly, 2 (1955), 277-283.

3. G. B. Dantzig and D. R. Fulkerson, On the min cut max flow theorem of networks, Annals
of Mathematics Study No. 38, Linear Inequalities and Related Systems, ed. H. W. Kuhn
and A. W. Tucker (Princeton, 1956), 215-221.

4. L. R. Ford, Jr., and D. R. Fulkerson, A simple algorithm for finding maximal network
flows and an application to the Hitchcock Problem, Can. J. Math., 9 (1957), 210-218.

5. L. R. Ford, Jr., and D. R. Fulkerson, Maximal flow through a network, Can. J. Math.,
8 (1956), 399-404.

6. D. Gale, A Theorem on flows in networks, RAND Corporation, Research Memorandum
RM-1737, 1956 (to appear in Pacific J. Math.).

7. D. Gale and A. Hoffman, Circulation in networks (unpublished notes).

8. P. Hall, On representatives of subsets, J. Lond. Math. Soc., 10 (1935), 26-30.

9. A. J. Hoffman and H. W. Kuhn, On systems of distinct representatives, Annals of
Mathematics Study No. 38, Linear Inequalities and Related Systems, ed. H. W. Kuhn and
A. W. Tucker (Princeton, 1956), 199-206.

10. A. J. Hoffman and H. W. Kuhn, Systems of distinct representatives and linear
programming, Amer. Math. Monthly, 63 (1956), 455-460.

11. H. B. Mann and H. J. Ryser, Systems of distinct representatives, Amer. Math. Monthly,
60 (1953), 397-401.







Santa Monica, California 














ON THE RANDOM DISORIENTATION OF TWO CUBES 
D. C. HANDSCOMB 


1. Introduction. We are given two identical symmetrical bodies (e.g., 
cubes) with independent random orientations; then we can always, in several 
ways, turn one of these bodies about some axis through its centre of gravity, 
so as to bring it into the same orientation as the other body. The smallest 
angle of rotation needed will be called the disorientation, d, of the two bodies, 
and we shall be concerned with the distribution of d under these conditions. 

Ignoring symmetry, the relative orientation of the bodies is given uniquely 
(modulo rotations of 27) by a single rotation; the required smallest rotation 
is the combination of this with some member of the symmetry group of the 
body. Using this fact only, and constructing random orthogonal matrices to 
describe the rotations, Mackenzie and Thomson (6) get an estimate of the 
distribution of d for cubes, by the Monte Carlo method. I shall now show how, 
by another method, the distribution can be found explicitly. 


2. Representation of rotations. We want a rotation to stand for the 
relative orientation of two independently oriented bodies; the distribution 
of rotations must therefore be invariant under any further arbitrary rotation 
of either body. Delthiel (5, pp. 99-106) sets out to obtain a distribution 
invariant in just such a manner. In the course of his work he represents the 
rotation $\mathcal{R}$ through angle V about the axis with direction cosines
$\alpha, \beta, \gamma$ by the coordinates $\lambda = \alpha \sin \tfrac12 V$,
$\mu = \beta \sin \tfrac12 V$, $\nu = \gamma \sin \tfrac12 V$, $\rho = \cos \tfrac12 V$,
satisfying $\lambda^2 + \mu^2 + \nu^2 + \rho^2 = 1$.

He goes on to find the law of composition of rotations so represented, which
is (altering his notation slightly):

$$\lambda'' = \lambda\rho' + \mu\nu' - \nu\mu' + \rho\lambda',$$
$$\mu'' = -\lambda\nu' + \mu\rho' + \nu\lambda' + \rho\mu',$$
$$\nu'' = \lambda\mu' - \mu\lambda' + \nu\rho' + \rho\nu',$$
$$\rho'' = -\lambda\lambda' - \mu\mu' - \nu\nu' + \rho\rho',$$

where the rotation $\mathcal{R}''$ is the resultant of $\mathcal{R}'$ and
$\mathcal{R}$ in that order. Note that $(\lambda, \mu, \nu, \rho)$ and
$(-\lambda, -\mu, -\nu, -\rho)$ both correspond to the same rotation, a
point which he does not mention. After this he changes his co-ordinates again
and obtains the probability measure he is looking for.

However the equation $\lambda^2 + \mu^2 + \nu^2 + \rho^2 = 1$ above may be
regarded as the equation of a hypersphere in 4-space, simply by taking
$\lambda, \mu, \nu, \rho$ as Cartesian co-ordinates, so that every rotation
corresponds to a pair of antipodal


Received April 30, 1957. 


points of the unit hypersphere. It is easy to verify that his measure is
equivalent to the measure of hypersurface of this hypersphere; indeed its
invariance under the group of rotations is evident from the law of composition,
which shows that the transformation
$(\lambda', \mu', \nu', \rho') \to (\lambda'', \mu'', \nu'', \rho'')$ is simply a
rotation of 4-space when $\lambda^2 + \mu^2 + \nu^2 + \rho^2 = 1$.

The identity $\mathcal{I}$ corresponds to $\pm(0, 0, 0, 1)$ and $\mathcal{R}^{-1}$,
the inverse of $\mathcal{R}$, to $\pm(\lambda, \mu, \nu, -\rho)$.
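
The law of composition and the remarks on the identity and the inverse are easy to check numerically. The following is a minimal sketch; the function names `rotation_point` and `compose` are assumptions made for the illustration, and angles are taken in radians.

```python
import math

def rotation_point(axis, angle):
    """Point (lambda, mu, nu, rho) for a rotation by `angle` about `axis`."""
    a, b, c = axis
    norm = math.sqrt(a * a + b * b + c * c)
    s = math.sin(angle / 2) / norm
    return (a * s, b * s, c * s, math.cos(angle / 2))

def compose(p, q):
    """Law of composition, with p the unprimed and q the primed point."""
    l, m, n, r = p
    l2, m2, n2, r2 = q
    return (l * r2 + m * n2 - n * m2 + r * l2,
            -l * n2 + m * r2 + n * l2 + r * m2,
            l * m2 - m * l2 + n * r2 + r * n2,
            -l * l2 - m * m2 - n * n2 + r * r2)

identity = (0.0, 0.0, 0.0, 1.0)
R = rotation_point((1.0, 2.0, -1.0), 0.7)
R_inverse = (R[0], R[1], R[2], -R[3])
T = rotation_point((0.0, 1.0, 1.0), 1.9)

product_norm = sum(x * x for x in compose(R, T))   # stays on the hypersphere
same_axis = compose(rotation_point((0, 0, 1), 0.3),
                    rotation_point((0, 0, 1), 0.5))
print(compose(R, identity), compose(R, R_inverse), product_norm, same_axis)
```

Composing a point with its inverse gives (0, 0, 0, -1), the antipode of the identity point, which illustrates why each rotation corresponds to a pair of antipodal points; two rotations about a common axis compose to the rotation through the sum of the angles.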

Suppose now that we apply an arbitrary rotation $\mathcal{R}$ to a symmetrical
body; what is the smallest rotation needed to restore it to its original aspect?
If G is the rotational symmetry group of the body in its original state, we want
to find the rotation $\mathcal{S} \in G$, such that $\mathcal{R}'' = \mathcal{S}\mathcal{R}$
has the smallest possible value of V''. Now the fourth line of the law of
composition states that $\cos \tfrac12 V''$ is the scalar product (to use vector
notation) of the point R representing $\mathcal{R}$ with the point representing
$\mathcal{S}^{-1}$, and $\mathcal{S} \in G$ implies that $\mathcal{S}^{-1} \in G$, so
that we need only look for the point S giving the largest value to R.S, that is,
the member S of the set of points of the hypersphere representing G which is
nearest to the point R; we can then find the value of V'', which is d.


3. Application to Cubes. Take lines parallel to the edges of one cube as 


axes; its 24 symmetry rotations are then represented by the following points 
and their antipodes: 


Identity                        (0, 0, 0, 1)
Rotations of:
  π about (1, 0, 0) etc.        (1, 0, 0, 0)    (0, 1, 0, 0)    (0, 0, 1, 0)
  ± ⅔π about (1, 1, 1) etc.     (½, ½, ½, ½)    (−½, −½, −½, ½)
                                (−½, ½, ½, ½)   (½, −½, −½, ½)
                                (½, −½, ½, ½)   (−½, ½, −½, ½)
                                (½, ½, −½, ½)   (−½, −½, ½, ½)
  π about (0, 1, 1) etc.        (0, √½, √½, 0)    (√½, 0, √½, 0)    (√½, √½, 0, 0)
                                (0, √½, −√½, 0)   (√½, 0, −√½, 0)   (√½, −√½, 0, 0)
  ± ½π about (1, 0, 0) etc.     (√½, 0, 0, √½)    (0, √½, 0, √½)    (0, 0, √½, √½)
                                (−√½, 0, 0, √½)   (0, −√½, 0, √½)   (0, 0, −√½, √½)


Now the first 12 of these points, with their antipodes, are the vertices of a 
24-cell, a regular polytope with Schläfli symbol {3, 4, 3} (3, p. 156); the
remainder are the vertices of a reciprocal polytope, also a 24-cell, of the same 
size. This configuration is obtained directly from the cubic group by various 
methods by Coxeter (2; 4) and by Robinson (7). It is symmetrical, in the 
sense that all its vertices are equivalent. Since we want to work out a distri- 
bution involving only the vertex nearest to a random point of the sphere, we 
can with justice select any one vertex, say (1,0,0,0), and let the random 
point be taken only from the sector of the hypersphere which is nearer to 
this vertex than to any other of the 48. 
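
The search for the nearest representative point can be sketched as follows. This is an illustration only: the 24 points are those tabulated above, antipodes are handled by taking the absolute value of the scalar product, and the helper names and the use of normalized Gaussian 4-vectors for uniformly distributed points of the hypersphere are assumptions of the sketch.

```python
import itertools
import math
import random

def cube_points():
    """The 24 points of the hypersphere representing the cubic rotations."""
    h, r = 0.5, math.sqrt(0.5)
    pts = [(0, 0, 0, 1), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
    # Rotations of +-2*pi/3 about the body diagonals.
    pts += [(sx * h, sy * h, sz * h, h)
            for sx, sy, sz in itertools.product((1, -1), repeat=3)]
    # Rotations of pi about the face diagonals.
    pts += [(0, r, r, 0), (r, 0, r, 0), (r, r, 0, 0),
            (0, r, -r, 0), (r, 0, -r, 0), (r, -r, 0, 0)]
    # Rotations of +-pi/2 about the edge directions.
    pts += [(r, 0, 0, r), (0, r, 0, r), (0, 0, r, r),
            (-r, 0, 0, r), (0, -r, 0, r), (0, 0, -r, r)]
    return pts

def disorientation(R, pts):
    """Smallest rotation angle d, in degrees, for the point R."""
    best = max(abs(sum(x * y for x, y in zip(R, S))) for S in pts)
    return math.degrees(2 * math.acos(min(1.0, best)))

def random_point(rng):
    """Uniformly distributed point of the unit hypersphere in 4-space."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

points = cube_points()
rng = random.Random(0)
samples = [disorientation(random_point(rng), points) for _ in range(2000)]
quarter_turn_half = (math.sin(math.pi / 8), 0.0, 0.0, math.cos(math.pi / 8))
print(len(points), max(samples), disorientation(quarter_turn_half, points))
```

A rotation of 45° about an edge direction is equidistant from the identity and from a quarter turn, so its disorientation is exactly 45°; no sampled value exceeds the maximum found in the next section.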






















4. Calculation of the Distribution. We can now find the probability
density for any angle d. The points of 4-space subtending with the point
(1, 0, 0, 0) an angle of $\tfrac12 d$ at the origin lie in the hyperplane
$\lambda = \cos \tfrac12 d$, which meets the hypersphere in the sphere
$\mu^2 + \nu^2 + \rho^2 = \sin^2 \tfrac12 d$. Those points which subtend a smaller
angle with this vertex than with any other of the 48 lie in the hypersolid
bounded by the hyperplanes $\lambda \pm \mu = \sqrt2\,\lambda$,
$\lambda \pm \nu = \sqrt2\,\lambda$, $\lambda \pm \rho = \sqrt2\,\lambda$,
$\lambda \pm \mu \pm \nu \pm \rho = 2\lambda$, which are the loci of points
equidistant from (1, 0, 0, 0) and from $(\sqrt{\tfrac12}, \pm\sqrt{\tfrac12}, 0, 0)$,
$(\sqrt{\tfrac12}, 0, \pm\sqrt{\tfrac12}, 0)$, $(\sqrt{\tfrac12}, 0, 0, \pm\sqrt{\tfrac12})$,
$(\tfrac12, \pm\tfrac12, \pm\tfrac12, \pm\tfrac12)$, respectively. These hyperplanes
meet $\lambda = \cos \tfrac12 d$ in the 6 planes of a cube:
$\mu, \nu, \rho = \pm(\sqrt2 - 1)\cos \tfrac12 d$, and the 8 planes of an octahedron:
$\pm\mu \pm \nu \pm \rho = \cos \tfrac12 d$, both concentric with the sphere. These
14 planes together bound a truncated cube, whose faces are 6 regular octagons
and 8 equilateral triangles. (See (1), Plate I, No. 15, for illustration.)

If d < 45°, the whole sphere lies within the truncated cube, so that p(d) is
48 times the probability density of points of 4-space lying on this hypersphere,
namely,

$$48 \cdot \tfrac12 \cdot \frac{4\pi \sin^2 \tfrac12 d}{2\pi^2} \cdot \frac{\pi}{180},$$

where d is measured in degrees. Thus if d < 45°,

$$p(d) = (2/15)(1 - \cos d).$$


If d > 45°, we have to reduce this by a factor equal to the proportion of
the surface of the sphere lying outside the truncated cube. If 45° < d < 60°,
the sphere meets the octagonal faces only. The proportion of area cut off by
these 6 planes is $3\{1 - (\sqrt2 - 1)\cot \tfrac12 d\}$. Therefore if 45° < d < 60°:

$$p(d) = (2/15)(1 - \cos d)(3(\sqrt2 - 1)\cot \tfrac12 d - 2)$$
$$\phantom{p(d)} = (2/15)(3(\sqrt2 - 1)\sin d - 2(1 - \cos d)).$$
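
The first two pieces of the density can be checked for consistency numerically. The sketch below is a rough illustration: it evaluates the two formulas, confirms that they agree at d = 45°, and compares a Simpson's-rule integral over 0° to 45° with the closed form (2/15)(45 - (180/π) sin 45°). The function names and the choice of quadrature are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)

def p_below_45(d):
    """Density per degree for d < 45 degrees."""
    return (2 / 15) * (1 - math.cos(math.radians(d)))

def p_45_to_60(d):
    """Density per degree for 45 < d < 60 degrees."""
    x = math.radians(d)
    return (2 / 15) * (3 * (SQRT2 - 1) * math.sin(x) - 2 * (1 - math.cos(x)))

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

mass_below_45 = simpson(p_below_45, 0.0, 45.0)
closed_form = (2 / 15) * (45 - (180 / math.pi) * math.sin(math.radians(45)))
gap_at_45 = abs(p_below_45(45.0) - p_45_to_60(45.0))
print(mass_below_45, closed_form, gap_at_45)
```

The two expressions coincide at 45° because 3(√2 - 1) sin 45° = 3(1 - cos 45°); roughly three-fifths of the probability lies below 45°.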


If d > 60°, the sphere meets the triangular faces also; it does not meet the
edges provided that $\tan \tfrac12 d < 2 - \sqrt2$, or d < 60.6°. The proportion of
area cut off by these 8 faces is $4(1 - (1/\sqrt3)\cot \tfrac12 d)$. Therefore if
60° < d < 60.6°:

$$p(d) = (2/15)(\{3(\sqrt2 - 1) + (4/\sqrt3)\}\sin d - 6(1 - \cos d)).$$


If d > 60.6°, we have to increase this to allow for the sectors of sphere we
have cut off twice, where an edge of the truncated cube goes inside the sphere.
The proportion of the surface of a sphere common to the interiors of two
small circles on it, of angular radii A, B, whose centres subtend an angle C
at the centre, is:

$$S(A, B, C) = \frac{1}{2\pi}\Big\{\pi - \arccos\Big(\frac{\cos C - \cos A \cos B}{\sin A \sin B}\Big) - \cos A \arccos\Big(\frac{\cos B - \cos C \cos A}{\sin C \sin A}\Big) - \cos B \arccos\Big(\frac{\cos A - \cos B \cos C}{\sin B \sin C}\Big)\Big\}.$$



Each octagonal face meets the sphere in a circle of radius

$$a(d) = \arccos((\sqrt2 - 1)\cot \tfrac12 d),$$

each triangular face in one of radius

$$b(d) = \arccos((1/\sqrt3)\cot \tfrac12 d),$$

two octagonal faces meeting at right angles and a face of each type meeting
at an angle $\pi - c$, where $\cos c = 1/\sqrt3$. The maximum value of d is
attained when all the vertices of the truncated cube lie on the sphere, when
$\cos d = \tfrac14(2\sqrt2 - 1)$, d = 62.8°. Therefore if 60.6° < d < 62.8°:

$$p(d) = (2/15)[\{3(\sqrt2 - 1) + (4/\sqrt3)\}\sin d + \{12\,S(a, a, \tfrac12\pi) + 24\,S(a, b, c) - 6\}(1 - \cos d)].$$


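Substituting and simplifying:

$$p(d) = (2/15)[\{3(\sqrt2 - 1) + (4/\sqrt3)\}\sin d - 6(1 - \cos d)]$$
$$\quad + (8/5\pi)(1 - \cos d)\Big\{\arccos\Big(\frac{\cot^2 \tfrac12 d}{3 + 2\sqrt2 - \cot^2 \tfrac12 d}\Big) + \tfrac12 \arccos\Big(\frac{\cot^2 \tfrac12 d - 2\sqrt2}{3 - \cot^2 \tfrac12 d}\Big)\Big\}$$
$$\quad - (8/5\pi)\sin d\,\Big\{2(\sqrt2 - 1)\arccos\Big(\frac{(\sqrt2 - 1)\cot \tfrac12 d}{\{1 - (\sqrt2 - 1)^2 \cot^2 \tfrac12 d\}^{1/2}}\Big) + (1/\sqrt3)\arccos\Big(\frac{(\sqrt2 - 1)^2 \cot \tfrac12 d}{\{3 - \cot^2 \tfrac12 d\}^{1/2}}\Big)\Big\}.$$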


This completes the distribution. The mean works out to be 42.7°. 

I wish to record my thanks to the United Kingdom Atomic Energy Authority 
for a financial grant, to Messrs Mackenzie and Thomson for bringing this 
problem to my notice through a private communication and for independently 
checking my formulae, to Mr. J. M. Hammersley for helpful criticism, and to 
Dr. Coxeter for further references. 


REFERENCES 


1. W. W. R. Ball, Mathematical Recreations and Essays (11th ed., London, 1939).

2. H. S. M. Coxeter, The binary polyhedral groups ..., Duke Math. J., 7 (1940), 367-379.

3. H. S. M. Coxeter, Regular Polytopes (London, 1948).

4. H. S. M. Coxeter, Regular honeycombs in elliptic space, Proc. Lond. Math. Soc. (3), 4
(1954), 471-501.

5. R. Delthiel, Probabilités Géométriques, Tome 2, Fasc. 2 of E. Borel, Traité du Calcul des
Probabilités ... (Paris, 1926).

6. J. K. Mackenzie and M. J. Thomson, Some statistics associated with the random
disorientation of cubes, Biometrika, 44 (1957), 205-210.

7. G. de B. Robinson, On the orthogonal groups in four dimensions, Proc. Camb. Phil. Soc.,
27 (1931), 37-48.










Christ Church, 
Oxford 











GEODESIC GROUPS OF MINIMAL SURFACES 


H. G. HELFENSTEIN 


1. Introduction. In a previous paper (6) we have studied those minimal 
surfaces which admit geodesic mappings without isometries or similarities on 
another, not necessarily minimal, surface. Here we determine all pairs of 
minimal surfaces which can be geodesically mapped on each other. We find 
that two such surfaces are either: 

(i) similar Bonnet associates of each other, or 

(ii) both Poisson surfaces (that is, isometric to a plane), or 

(iii) both Scherk surfaces (2). 

In case (i) the mappings are combinations of trivial transformations with the 
group of self-isometries of a minimal surface (7). In case (ii) the mappings are 
generated by the projectivities of a (complex) plane into itself, followed by 
isometries and similarities. Similarly in case (iii) the mappings are those of 
the geodesic group of a single complex Scherk surface combined with trivial 
transformations. The two groups in (ii) and (iii) have entirely different 
structures; for example, the latter is intransitive and mixed (discrete- 
continuous). 

If one admits only non-trivial mappings transforming a real two-dimensional
domain on an image of the same type, then only the real projectivities between
real planes will remain. This property might be termed “geodesic rigidity of
real minimal surfaces,” in contrast to their “isometric flexibility.”


2. Analytical formulation. First we dispose of the Poisson surfaces. 
Since their Gaussian curvature vanishes, any other surface onto which they 
can be geodesically mapped must be of constant curvature according to 
Beltrami’s theorem. From the Weierstrass representation of a non-cylindrical 
minimal surface one can conclude that their curvature is never constant. 
Hence Poisson surfaces can be geodesically mapped only on Poisson surfaces. 
According to (6, Theorem 2) this also settles the case of Lie surfaces. 

Dini’s theorem deals with two surfaces S and S’ admitting a non-trivial 
geodesic mapping. They must be of Liouville’s type, and their respective line- 
elements in corresponding points can be written in suitable coordinates x, y 
in the forms 


(1)  $ds^2 = [A(x) + B(y)](dx^2 + dy^2)$,

(2)  $ds'^2 = -\Big(\frac{1}{A} + \frac{1}{B}\Big)\Big(\frac{dx^2}{A} - \frac{dy^2}{B}\Big)$.


Received March 25, 1957. This paper was prepared while the author was a fellow at the 
Summer Research Institute of the Canadian Mathematical Congress in Kingston, Ontario. 




The latter can be reduced again to Liouville’s form

(3)  $ds'^2 = -\Big(\frac{1}{A} + \frac{1}{B}\Big)(dx'^2 + dy'^2)$

by means of the transformation

(4)  $x' = \int \frac{dx}{\sqrt{A(x)}}, \quad y' = \int \frac{dy}{\sqrt{-B(y)}}$.


In (6, pp. 321-332), we showed that the most general Liouville systems on
non-cylindrical minimal surfaces are of one of the following two types, (5) and
(7):

(5)  $ds^2 = G(\lambda e^{\varepsilon h x} - \mu e^{-\varepsilon h x})^2 e^{2kx}(dx^2 + dy^2)$,

with the (complex) constants $G, \lambda, \mu, k, h, \varepsilon$ satisfying

(6)  $G \neq 0$, $\lambda \neq 0$, $\mu \neq 0$, $h \neq 0$, $\varepsilon = \pm 1$;

or

(7)  $ds^2 = G(2\varepsilon x + H)^2 e^{2kx}(dx^2 + dy^2)$,

with $G \neq 0$, $\varepsilon = \pm 1$, H and k arbitrary constants.

We have therefore three cases to consider: 

(a) both surfaces are of type (5); 

(b) one is of type (5) and the other of type (7); 

(c) both belong to type (7). 

We shall consider in detail only (a) which is the only case that can actually 
arise; in a similar way one shows that cases (b) and (c) cannot occur. 

Let (x, y) be a system of co-ordinates establishing the mapping between S
and S' by association of points with the same set of co-ordinates. According to
(5) it is a Liouville system on S of the form (1) with

(8)  $A(x) = a(z) = G e^{\beta z}(\lambda e^{z} - \mu)^2 - D$,

(9)  $B(y) = D$, D a non-vanishing constant,

where

(10)  $z = 2\varepsilon h x$,

and

(11)  $\beta = \varepsilon k/h - 1$, notation of (6, formula (73)).

Let (x', y') be the system introduced by (4) which reduces the line-element of
S' to the form (3). Since S' is by assumption also of type (5) comparison of
(3) and (5) shows that there must be constants
$G', \lambda', \mu', k', h', \varepsilon'$ satisfying analogous conditions to (6)
such that

(12)  $a(z) \cdot a'(z') = -1$,

whenever (4) holds. Here we defined

(13)  $z' = 2\varepsilon' h' x'$, $\beta' = \varepsilon' k'/h' - 1$,

(14)  $a'(z') = G' e^{\beta' z'}(\lambda' e^{z'} - \mu')^2 + 1/D$.











Because of the reciprocity of the surfaces S and S' the analytical problem
may also be stated as follows: for which choice of the constants in (8) and
(14) are the following expressions resulting from (4) inverse functions?

(15)  $z'(z) = \frac{\varepsilon' h'}{\varepsilon h} \int \frac{dz}{\sqrt{a(z)}}, \quad z(z') = \frac{\varepsilon h}{\varepsilon' h'} \int \frac{dz'}{\sqrt{-a'(z')}}$.


3. Determination of the constants. From (12) we find by differentiation

(16)  $\frac{da'}{da} = -\frac{a'}{a}$,

and from (15)

(17)  $\frac{dz'}{dz} = \frac{\varepsilon' h'}{\varepsilon h} \cdot \frac{1}{\sqrt{a(z)}}$.

Differentiating (8) and (14) with respect to their arguments z and 2’, dividing 
the results, and using (16) and (17) we obtain the following new relation 
between z and 2’. 


(18) B+ one — nw) De Vv (a(z)) . 

We can solve (18) for 2’ and substitute the result in (12), deriving the following 
identity which contains only the variable z. 

(19) G(r)” & Re)? + 4G'D')? Ph” a (2) S* (2) = 0. 


Here we made use of the following abbreviations: 


$$R(z) = -\varepsilon h D[\lambda(\beta + 2)e^{z} - \beta\mu] - (\beta' + 2)\varepsilon' h' \sqrt{a(z)}\,(\lambda e^{z} - \mu),$$
$$S(z) = -\varepsilon h D[\lambda(\beta + 2)e^{z} - \beta\mu] - \beta' \varepsilon' h' \sqrt{a(z)}\,(\lambda e^{z} - \mu).$$

A discussion of (19) will now lead to the conclusion $\beta = -1$. First we see
that with $\beta = -2$ equation (19) would be impossible; for a(z) would have at
least one zero which, if substituted into (19), would lead to a contradiction.
Hence we can assume $\beta \neq -2$. The function a(z) is then, by the assumptions
(6), an entire function of order 1 which is not of the form P(z) exp(Az),
(P(z) a polynomial, A a constant) characteristic for such functions with at
most a finite number of zeros (1). Consequently a(z) has infinitely many
different zeros. Substituting two of them ($z_1$ and $z_2$) into (19) gives

(20)  $e^{z_1} = e^{z_2} = \frac{\beta\mu}{(\beta + 2)\lambda}, \quad z_2 - z_1 = 2\pi i n$.

Together with $a(z_1) = a(z_2) = 0$, equations (20) yield $e^{2\pi i n \beta} = 1$,
which shows that $\beta$ is a real rational quantity $\beta = m/n \neq 0$, n > 0.

It is easily seen that m > 0 must be excluded, for in that case the equation
a(z) = 0 is equivalent to

$$P(\zeta) = \zeta^{m}(\lambda\zeta - \mu)^{2n} - (D/G)^{n} = 0,$$














where $P(\zeta)$ is a polynomial of degree m + 2n in $\zeta = e^{z}$. According to (20)

$$\zeta_0 = \frac{\beta\mu}{(\beta + 2)\lambda}$$

is its only root. This is a double root, for one verifies that

$$P(\zeta_0) = P'(\zeta_0) = 0, \quad P''(\zeta_0) \neq 0.$$

Hence $P(\zeta)$ must be of degree 2. But m + 2n = 2 contradicts $\beta \neq -2$.
Let therefore m = -M, M > 0. Now the equation a(z) = 0 becomes the
polynomial equation

$$Q(\zeta) = (\lambda\zeta - \mu)^{2n} - (D/G)^{n}\zeta^{M} = 0.$$

Here $Q(\zeta)$ must be of degree two by a similar analysis. The case 2n = M is
incompatible with $\beta \neq -2$, while 2n < M would entail M = 2 which is
impossible because of n > 0. Hence we conclude 2n > M. Under these
circumstances the degree of $Q(\zeta)$ is 2n, or 2n = 2, M = 1, $\beta = -1$.


By virtue of (12) we may replace a(z) on the right of (18) by $-1/a'(z')$,
solve for z, and substitute into (12). We obtain an identity analogous to (19)
in z'. A similar analysis of the roots of a'(z') establishes also $\beta' = -1$.
Assuming

(21)  $\beta = \beta' = -1$,


we can reduce equation (19) to a polynomial identity in e* of the fourth degree. 
By comparison of the coefficients we obtain the following further conditions: 


= X oy = — 
(22) D IMG = TG 


(23) (*) —- | 


Equations (21), (22), and (23) are the necessary and sufficient conditions 
that non-trivial geodesic mappings are possible between two surfaces with 
line-elements of type (5). 


Since we used only the derivative of (15) there is still a constant of integra- 
tion to be determined. Under the conditions (21-23) we find from (15) that 
vy at vi — re) ey 
(24) . (tt Me Mee 
Substitution in (12) determines C, viz.

$$C = -\mu'/\lambda'.$$

Distinguishing the possibilities for the signs of the square roots and of
$\varepsilon$ and $\varepsilon'$ one finds from (24) four different relations between
z and z' which may be summarized as follows. Let $\gamma$ be a fixed value of
$\sqrt{-\lambda/\mu}$, $\delta$ a value of $\sqrt{-\lambda'/\mu'}$ and define
(25) Z = ye", Z’ = be". 
















Then our maps are described by the following bilinear transformations 
(denoted by e, f, g, h for reasons explained later) : 


»_1+Z a 
o FT ww STs s' 
? ite ae AO ‘Pee ie 


In every case the corresponding connection between y and y’ is found from 
(4) and (9) to be: 


(26)  $y' = (-D)^{-1/2}\, y + y_0$.


4. Geometrical interpretation. According to (6, p. 323), the equality of
$\beta$ and $\beta'$ indicates that two non-cylindrical minimal surfaces admitting
non-trivial geodesic mappings on each other are similar-isometric, that is, they
can also be mapped in a trivial way. The value $\beta = -1$ shows that both of the
surfaces are Scherk surfaces, obtained from the catenoid or the right helicoid
by bending and similarities. They may be represented as in (2):

(27)
$$x_1 = A(\sin\varphi \cosh X \cos Y + \cos\varphi \sinh X \sin Y),$$
$$x_2 = A(\sin\varphi \cosh X \sin Y - \cos\varphi \sinh X \cos Y),$$
$$x_3 = A(\sin\varphi \cdot X + \cos\varphi \cdot Y).$$

Here $A \neq 0$ and $\varphi$ are arbitrary constants, and X, Y is a special system of
Liouville coordinates. For the image surface we have a similar representation,
and X = X', Y = Y' establishes the trivial mapping mentioned above.
Besides X, Y there are other Liouville systems on S which are all given by
transformations of the type

$$X = cx + X_0, \quad Y = \pm cy + Y_0$$

or

$$X = cy + X_0, \quad Y = \pm cx + Y_0.$$
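
That X, Y are Liouville (indeed isothermal) coordinates for every value of φ in the representation (27) can be verified directly: the coefficients of the line-element are E = G = A² cosh² X and F = 0, independently of φ. The sketch below checks this numerically from the analytic partial derivatives; the function names and the sample values are assumptions of the illustration.

```python
import math

def partials(A, phi, X, Y):
    """Partial derivatives of (x1, x2, x3) in (27) with respect to X and Y."""
    sp, cp = math.sin(phi), math.cos(phi)
    ch, sh = math.cosh(X), math.sinh(X)
    cY, sY = math.cos(Y), math.sin(Y)
    dX = (A * (sp * sh * cY + cp * ch * sY),
          A * (sp * sh * sY - cp * ch * cY),
          A * sp)
    dY = (A * (-sp * ch * sY + cp * sh * cY),
          A * (sp * ch * cY + cp * sh * sY),
          A * cp)
    return dX, dY

def first_fundamental_form(A, phi, X, Y):
    """Coefficients E, F, G of the line-element of the surface (27)."""
    dX, dY = partials(A, phi, X, Y)
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return dot(dX, dX), dot(dX, dY), dot(dY, dY)

A, X, Y = 1.5, 0.4, -1.1
expected = (A * math.cosh(X)) ** 2
forms = [first_fundamental_form(A, phi, X, Y) for phi in (0.0, 0.6, math.pi / 2)]
print(expected, forms)
```

Since the line-element does not depend on φ, the members of the family are isometric to one another, which is the bending referred to above.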


Identifying such a general x, y system with the system introduced in (5) and
comparing the line-element obtained from (27) with (5) we see that equations
(25) go over into

(28)  $Z = e^{X}, \quad Z' = e^{X'}$,

while (26), if (22) and (23) are taken into account, transforms into

(29)  $Y' = \pm iY + Y_0$, $Y_0$ = const.

Here (X, Y) and (X', Y') are two corresponding points in the representations
(27), and the non-trivial mappings are given by (e, f, g, h) and (29).

Inspection of the line-element of (27) shows that, besides the trivial map 
described by X = X’, Y = Y’, there are three more classes of such isometric- 
similar maps. With the notation (28) they are all listed as follows: 














(a) $Z' = Z$,    (b) $Z' = -Z$,

(c) $Z' = 1/Z$,    (d) $Z' = -1/Z$,

and in every case

(30)  $Y' = \pm Y + Y_0$.


It is easy to see that the non-trivial transformations carry certain real lines 
into real curves, but no real two-dimensional domain into a similar domain, 
and hence follows the geodesic rigidity mentioned in the introduction. 

Since the geodesics of (27) can be expressed by 


tanh X = cn (+ & Y + C;k) k, C, constants of integration, 
our mappings illustrate certain transformations of the Jacobian elliptic func- 


tions. 


5. Structure of the geodesic group of a Scherk surface. All geodesic
mappings of a Scherk surface onto itself form a non-abelian, intransitive, mixed
discrete-continuous group $\mathfrak{G}$. Writing the transformations (29) and (30)
in the form

(31)  $Y' = i^{r} Y + \eta$,

we may describe every element of $\mathfrak{G}$ by a symbol
$\sigma = (\varphi, r, \eta)$, where $\varphi$ denotes one of the bilinear
transformations (a, ..., h), r is a residue class (mod 4), and $\eta$ a complex
number; if $\varphi$ is an element from the set (a, b, c, d) r can take only the
values 0 or 2, and for $\varphi$ belonging to (e, f, g, h), r must be either 1 or
3. The product of $\sigma$ by $\sigma' = (\varphi', r', \eta')$ is given by

$$\sigma\sigma' = (\varphi\varphi',\ r + r',\ i^{r}\eta' + \eta),$$

where $\varphi\varphi'$ is the composite of the corresponding bilinear functions.

The transformations $\varphi$ form a group which is isomorphic to the dihedral
group $D_4$ of order 8. Denoting by $\mathfrak{L}$ the group of the linear
substitutions of a complex variable of the form (31) we find that $\mathfrak{G}$
is an invariant subgroup of the direct product $D_4 \times \mathfrak{L}$.

The proper combinations $(\varphi, r)$ of the first two symbols form a non-abelian
group $\mathfrak{F}$ of order 16, the “finite part of $\mathfrak{G}$.” It is a
normal subgroup of $D_4 \times C_4$. Its identity element is E = (a, 0), and it
may be generated by the two elements

$$P = (e, 1), \quad R = (b, 0)$$

with the defining relations

$$P^4 = R^2 = (PR)^4 = (P^2R)^2 = E.$$

In fact, $\mathfrak{F}$ is (4, 4 | 2, 2) in the notation of Coxeter (4, p. 81;
5, p. 421), that is it is the “rotation” group of the regular map
$\{4, 4\}_{2,0}$ of four squares covering a torus. By naming the 8 edges as in
Fig. 1, we can express the generators as permutations:

$$P = (1\,2\,3\,4)(5\,6\,7\,8), \quad R = (1\,5)(3\,7).$$

[Figure 1: the regular map $\{4, 4\}_{2,0}$ of four squares covering a torus, with its 8 edges numbered 1 to 8.]

FIGURE 1

According to (3) the group $\mathfrak{F}$ can also be generated by the three
elements P, R, and Q = (d, 0) with the relations

$$P^4 = Q^2 = R^2 = E, \quad PQ = QP, \quad QR = RQ, \quad PR = RPQ.$$
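
The permutation form of the generators makes these relations easy to verify mechanically. The following is a small sketch, with products read as “right factor first”; the helper names are assumptions, and the element Q is computed from the relation PR = RPQ rather than taken from the symbol (d, 0).

```python
def perm(cycles, n=8):
    """Permutation of 1..n given by disjoint cycles, as a tuple (index 0 unused)."""
    images = list(range(n + 1))
    for cycle in cycles:
        for i, x in enumerate(cycle):
            images[x] = cycle[(i + 1) % len(cycle)]
    return tuple(images)

def mul(f, g):
    """Composite permutation: apply g first, then f."""
    return tuple(f[g[x]] for x in range(len(f)))

def power(f, k):
    result = tuple(range(len(f)))
    for _ in range(k):
        result = mul(result, f)
    return result

def inverse(f):
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)

def generate(generators):
    """Closure of the generators under multiplication."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = mul(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group

E = perm([])
P = perm([(1, 2, 3, 4), (5, 6, 7, 8)])
R = perm([(1, 5), (3, 7)])
F = generate([P, R])
Q = mul(inverse(mul(R, P)), mul(P, R))  # defined by PR = RPQ
print(len(F), Q[1:])
```

In this edge labelling Q comes out as (1 5)(2 6)(3 7)(4 8), an element of order 2 commuting with both generators, as the relations PQ = QP and QR = RQ require.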
Every subgroup of $\mathfrak{G}$ induces in $\mathfrak{F}$ a subgroup, and
conversely every subgroup of $\mathfrak{F}$ gives rise to subgroups of
$\mathfrak{G}$; in the same way invariant subgroups correspond to each other.
$\mathfrak{F}$ contains the following 21 proper subgroups which are all
abelian (cyclic groups are listed by their generators):

Order 2:
$C_2^1 = (a, 2)$, $C_2^2 = (b, 0)$, $C_2^3 = (b, 2)$,
$C_2^4 = (c, 0)$, $C_2^5 = (c, 2)$, $C_2^6 = (d, 0)$, $C_2^7 = (d, 2)$.


Order 4: There are 7 Kleinian groups D, and 4 cyclic groups C,, viz. 


D=GxXG, D=-GxG, D= Gx &, 

Di=-GxG, D=- Gx, D=- Gx, D= Gx &; 
Gi = (e, 1), = (c3), CG=(f,1), = 1). 

Order 8: 


W=GxG, wW=-GxG, w=-GxGx &. 
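
The subgroup count can be checked mechanically from the permutation form of the generators. The following is a minimal sketch, assuming the readings P = (1 2 3 4)(5 6 7 8) and R = (1 5)(3 7) and the relations $P^4 = R^2 = (PR)^4 = (P^2R)^2 = E$; the helper names (`compose`, `closure`, and so on) are illustrative and not part of the paper.

```python
from collections import Counter
from itertools import product

def compose(p, q):
    """Apply permutation p first, then q (permutations are tuples on 0..7)."""
    return tuple(q[p[i]] for i in range(len(p)))

def from_cycles(cycles, n=8):
    """Build a permutation from 1-based cycles such as [(1, 2, 3, 4), (5, 6, 7, 8)]."""
    perm = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a - 1] = b - 1
    return tuple(perm)

E = tuple(range(8))
P = from_cycles([(1, 2, 3, 4), (5, 6, 7, 8)])
R = from_cycles([(1, 5), (3, 7)])

def power(g, k):
    result = E
    for _ in range(k):
        result = compose(result, g)
    return result

def closure(gens):
    """Subgroup generated by gens (finite group, so products of generators suffice)."""
    elems, frontier = {E}, [E]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def is_abelian(H):
    return all(compose(a, b) == compose(b, a) for a in H for b in H)

G = closure([P, R])
# Every subgroup of this group needs at most three generators,
# so closing all triples of elements enumerates all subgroups.
subgroups = {closure(t) for t in product(G, repeat=3)}
proper = [H for H in subgroups if 1 < len(H) < len(G)]
orders = Counter(len(H) for H in proper)
print(len(G), dict(orders), all(is_abelian(H) for H in proper))
```

Under these assumptions the enumeration agrees with the list above: 7 subgroups of order 2, 11 of order 4 and 3 of order 8, all abelian.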
The invariant proper subgroups of § are: 
i, oH (= commutator group), C3, 
D:(=centre), D3 OD, wv, wu, wv 
The trivial geodesic mappings form a non-abelian invariant subgroup of @ 
the finite part of which is U*. The product of any two non-trivial mappings’ is 


trivial; in particular the non-trivial mappings may be considered as ‘‘square 
roots’ of isometries. 
















REFERENCES 


1. L. Bieberbach, Lehrbuch der Funktionentheorie, 2 (Berlin, 1931).

2. W. Blaschke, Einführung in die Differentialgeometrie (Berlin, 1950).

3. W. Burnside, Theory of Groups of Finite Order (Cambridge, 1911).

4. H. S. M. Coxeter, The abstract groups $G^{m,n,p}$, Trans. Amer. Math. Soc., 45 (1939), 73-150.

5. H. S. M. Coxeter, Self-dual configurations and regular graphs, Bull. Amer. Math. Soc., 56 (1950), 413-455.

6. H. Helfenstein and M. Wyman, Geodesic mapping of minimal surfaces, Math. Annalen, 132 (1956), 310-327.

7. L. Sinigallia, Sulle superficie ad area minima applicabili su sé stesse, Giorn. Mat., 36 (1898), 172.







University of Alberta 
and 
University of Ottawa 



















SPECTRAL THEORY FOR A CLASS OF NON- 
NORMAL OPERATORS II 


HARRY GONSHOR 


1. Introduction. In a previous paper (2) we have developed a spectral
theory and a unitary equivalence theory for a certain class of non-normal
operators. We dealt primarily with operators which were called $J_2$ operators.
At present we are interested in studying the uniformly closed rings generated
by such operators.

The notation will be the same as in (2). 


2. Classification of $J_2$ operators. In view of the uniqueness of the projection
valued measure on $Z \cup Q$ for a given operator B, operators may be
classified by the nature of the collection of null sets in the measure.

(2.1) A pure $J_2$ operator is a $J_2$ operator whose associated projection valued
measure satisfies E(Z) = 0. This is the same as saying that the normal kernel
is 0.

(2.2) A boundedly pure $J_2$ operator is a pure $J_2$ operator which has the
additional property that there exists an $a > 0$ such that the measure on Q is
concentrated on the set of points $(\lambda, \mu, \alpha)$ where $\alpha > a$.

(2.3) A separated $J_2$ operator is a $J_2$ operator whose associated projection
valued measure restricted to Q is concentrated on the set of points $(\lambda, \mu, \alpha)$ where
$\alpha > a$ for a given positive $a$. (Note that there is no restriction on the measure
restricted to Z.)

Even though the concept of a separated operator might seem artificial, it
will be seen that it comes up naturally in the study of uniformly closed rings.
The separated case turns out to be simple whereas strange difficulties appear
in the non-separated case. It will be convenient to annex the points $(\lambda, \mu, 0)$
with $\lambda > \mu$ to Q to obtain the space $Q_\infty$ in dealing with non-separated
operators.


3. The abstract spectrum. We are given that B = frxdEQ) where FE 
is a projection valued measure on ZV Q. The abstract spectrum of A will 
consist of points in Z U Q,. 

(3.1) p is in the discrete spectrum $\Leftrightarrow$ $E(p) > 0$.

(3.2) p is in the spectrum $\Leftrightarrow$ every open set containing p has positive
measure. (Open sets in $Z \cup Q_\infty$ are defined in the natural manner, that is,
unions of open sets in Z and open sets in $Q_\infty$, regarded as a subspace of
$Z \oplus Z \oplus Z$.)


Received March 21, 1957. 


(3.3) p is in the continuous spectrum $\Leftrightarrow$ p is in the spectrum but not in the
discrete spectrum.

It is easily seen that for normal operators the definitions are equivalent to 
the usual definitions. 


LEMMA 3.1.

(a) B is a pure $J_2$ operator $\Leftrightarrow$ spectrum of $B \subset Q_\infty$.

(b) B is a separated $J_2$ operator $\Leftrightarrow$ spectrum of $B \subset Q \cup Z$.

(c) B is a boundedly pure $J_2$ operator $\Leftrightarrow$ spectrum of $B \subset Q$.


The proofs are trivial. Note that the measure is concentrated on a compact
set (in fact the set $\{\lambda\} \cup \{(\lambda, \mu, \alpha)\}$ where $|\lambda|, |\mu|, \alpha \le \|A\|$).


LEMMA 3.2. p is in the spectrum $\Leftrightarrow$ every set of a complete base of neighbourhoods
of p has positive measure.


LEMMA 3.3. The spectrum is compact.


Proof. Clearly the spectrum is bounded. Suppose $p \notin K$ where K is the spectrum.
Then there exists an open set U containing p of zero measure. By
definition $U \cap K$ is empty. This shows that the spectrum is closed.


LEMMA 3.4. The complement of the spectrum has measure zero.

Proof. It suffices to remark that $Z \cup Q_\infty$ has a countable base for open sets.


LEMMA 3.5. The spectrum K is the unique minimal closed set the complement
of which has measure zero.


Proof. Suppose C is a closed set with the above property. Then $K' \cup C'$ is
an open set of measure zero. Hence $K' \cup C' \subset K'$. Thus $K \subset C$ and this
proves the minimality of K.


COROLLARY 1. All non-empty open subsets of K in the relative topology
induced by K have positive measure.


COROLLARY 2. If L is a null set in K, $\overline{K - L} = K$.


For completeness we state two further lemmas the proofs of which are relatively
simple.


LEMMA 3.6. A compact set K is the spectrum of some operator if and only if
$\overline{K \cap Q} = K \cap Q_\infty$.


LEMMA 3.7. $\lambda_0$ is in the ordinary spectrum of B if and only if the abstract
spectrum of B contains either $\lambda_0$, a point of the form $(\lambda_0, \lambda, \alpha)$ or a point of the
form $(\lambda, \lambda_0, \alpha)$.


4. Uniformly closed rings. The spectrum will play an important part in
the study of the uniformly closed ring C(A, A*) generated by A and A*. It
will be convenient to break up the study of these rings into cases.










Case 1. A is boundedly pure. It can be seen from (2) that the weakly closed
ring R(A, A*) generated by A and A* corresponds to the set of all $L_\infty$ second
order matrix functions on Q. By Lemma 3.4 the spectrum of A may be used as
the domain space. The task ahead of us is to see what happens to C(A, A*) in
this correspondence.

Suppose $B \in C(A, A^*)$. Then B can be uniformly approximated by polynomials
in A and A*. Let $B_n$ be a sequence of polynomials approaching B
uniformly. Then $B_n(t)$ approaches $B(t)$ uniformly except on a null set, hence
by Corollary 2 to Lemma 3.5, $B_n(t)$ approaches $B(t)$ uniformly on a dense
subset of K. Since $B_n(t)$ is continuous for all n, $B(t)$ may be redefined on the
null set if necessary so as to become a continuous function on K. The new
$B(t)$ still corresponds to B.

Strictly speaking B corresponds to an equivalence class of matrix functions
on K. However, by Corollary 1 to Lemma 3.5, there is at most one continuous
function in any equivalence class. Thus we have a well-defined mapping from
operators in C(A, A*) into the set of continuous second order matrix functions
on K. It is easily verified that this mapping is norm-preserving and an algebraic
isomorphism into. Note that some caution is required in the proof since the
map on C(A, A*) is not just the restriction of the map on R(A, A*). For
example, to verify that

$$\|B(t)\|_\infty = \max_t \|B(t)\|,$$

the continuity of $B(t)$ must be used as well as the fact that on K, open sets
have positive measure. Since K is compact it follows from (3) that the mapping
is onto. Thus we have proved


THEOREM 1. The uniformly closed * ring generated by a boundedly pure $J_2$
operator A is algebraically isomorphic and isometric to the algebra of all continuous
second order matrix valued functions on the spectrum of A.


By * ring is meant a ring which is closed under the adjoint operation. 
This technique gives an alternative way of proving the corresponding well- 
known theorem for normal operators. 


Case 2. A is separated. It is easy to generalize Theorem 1 to this case.
K is the union of a compact set in Q with a compact set in Z. Every $B \in C(A, A^*)$
can be decomposed in such a way that $B(t)$ is a continuous second
order matrix valued function on t restricted to Q, and a continuous complex
valued function on t restricted to Z. Again using (3) we obtain


THEOREM 2. The uniformly closed * ring generated by a separated operator A
is algebraically isomorphic and isometric to the algebra of all functions on the
spectrum of A which are continuous second order matrix valued on $K \cap Q$ and
continuous complex valued on $K \cap Z$.


Case 3. A is not separated. This case is more difficult than the separated 
case because not all continuous functions on the spectrum are obtained. For 













example, if $\lambda$, $\mu$, and $(\lambda, \mu, 0)$ are in the spectrum, and if f is a uniform limit
of polynomials, then
$$f(\lambda, \mu, 0) = \begin{pmatrix} f(\lambda) & 0 \\ 0 & f(\mu) \end{pmatrix}.$$


Thus the continuous functions that correspond to operators in C(A,A*) satisfy 
certain conditions. 

At any rate, the previous technique remains valid up to the use of (3).
Thus we have a one-one norm-preserving mapping of C(A, A*) into the set of
all continuous functions on the spectrum. (It is understood that the values are
matrices of order two on $Q_\infty$ and complex numbers on Z.) It is clear that all
functions g in the image have the following property when restricted to
$(Q_\infty - Q) \cup Z$: There exists a continuous function h on the complex numbers
Z such that $g(\lambda) = h(\lambda)$ for all $\lambda$ in the spectrum and

$$g(\lambda, \mu, 0) = \begin{pmatrix} h(\lambda) & 0 \\ 0 & h(\mu) \end{pmatrix}$$

for all $(\lambda, \mu, 0)$ in the spectrum. It will be shown that this condition is also
sufficient for a continuous function to be in the image.

Let $g(t)$ be any continuous function satisfying the above condition. Choose
a polynomial function $f(t)$ which satisfies $|g(t) - f(t)| < \tfrac12\epsilon$ for all t which
either are in the part of the spectrum which is in Z or have the property that
$(\lambda, t, 0)$ or $(t, \lambda, 0)$ is in the spectrum for some $\lambda$. It follows immediately that

$$\|g(t) - f(t)\| < \tfrac12\epsilon, \qquad t \in (Q_\infty - Q) \cup Z.$$

By uniform continuity we can even guarantee that $\|g(t) - f(t)\| < \tfrac12\epsilon$ for
all t in the spectrum except possibly those of the form $(\lambda, \mu, \alpha)$ with $\alpha > a$
for some positive $a$. Now consider the set of all $(\lambda, \mu, \alpha)$ in the spectrum
which satisfy $\alpha > \tfrac12 a$. There exists a polynomial $h(t)$ which satisfies
$\|g(t) - h(t)\| < \tfrac12\epsilon$ for all such t. By the same technique as in (3) it may be
shown that there exists a positive definite self-adjoint matrix valued function
$e(t)$ satisfying $\|e(t)\| \le 1$ for all t which is a uniform limit of a sequence of
polynomial functions, and satisfying $e(t) = 1$ for all $(\lambda, \mu, \alpha)$ such that $\alpha > a$
and $e(t) = 0$ for all $t \in Z$ and all $(\lambda, \mu, \alpha)$ such that $\alpha < \tfrac12 a$. Note that even
though separation of points fails to hold in the spectrum, the proof goes
through since separation fails only among points in $Q_\infty - Q$ and Z. Consider
$eh + (1 - e)f$. It is easily verified that

$$\|g - [eh + (1 - e)f]\| < \epsilon$$
for all t. This completes the proof.
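
The verification left to the reader can be written out pointwise. The sketch below assumes that both approximations are within $\tfrac12\epsilon$ on the sets where they are claimed, and that $0 \le e(t) \le 1$ as a self-adjoint matrix, so that $\|e(t)\|$ and $\|1 - e(t)\|$ are both at most 1.

```latex
\begin{aligned}
g - [eh + (1-e)f] &= e\,(g-h) + (1-e)\,(g-f), \\
\bigl\| g(t) - [e(t)h(t) + (1-e(t))f(t)] \bigr\|
  &\le \|e(t)\|\,\|g(t)-h(t)\| + \|1-e(t)\|\,\|g(t)-f(t)\|.
\end{aligned}
```

Where $\alpha > a$ we have $e = 1$ and only the term $g - h$ survives; where $\alpha < \tfrac12 a$ or $t \in Z$ we have $e = 0$ and only $g - f$ survives; in the remaining band $\tfrac12 a \le \alpha \le a$ both estimates hold, so the right-hand side is less than $\tfrac12\epsilon + \tfrac12\epsilon = \epsilon$.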


THEOREM 3. The uniformly closed ring generated by a $J_2$ operator is algebraically
isomorphic and isometric to the algebra of all functions g on the spectrum
K of A which are continuous complex valued on $K \cap Z$ and continuous second
order matrix valued on $K \cap Q_\infty$, and which have the additional property that there
exists a continuous complex valued function f such that $g(t) = f(t)$ for all t in
$K \cap Z$ and
$$g(\lambda, \mu, 0) = \begin{pmatrix} f(\lambda) & 0 \\ 0 & f(\mu) \end{pmatrix}$$
for all $(\lambda, \mu, 0)$ in $K \cap Q_\infty$.


This theorem which generalizes the previous theorems illustrates the funda- 
mental distinction between separated and non-separated operators. 

We conclude this section with certain remarks concerning the equality of 
C(A, A*) and R(A, A*). Further details are found in (1). The equality 
R(A, A*) = C(A, A*) is equivalent to the statement: Every open set in 
spectrum A differs from a closed open set by a null set if A is separated. 
Otherwise, R(A, A*) is never equal to C(A, A*). 


5. A counter-example. This section is independent of the rest of the 
paper. 

It is known that all operators A satisfying $A^2 = 0$ are $J_2$ operators. (A proof
may be found in (1).) However, the corresponding statement for operators A
satisfying $A^n = 0$ is not valid. In fact, we give an example of an operator A
satisfying $A^3 = 0$ on a space of $\aleph_0$ dimensions which is irreducible.

Let $a_1, b_1, b_2, \ldots, b_n, \ldots, c_1, c_2, \ldots, c_n, \ldots$ be a basis of the Hilbert space H. Let

$$Aa_1 = b_1, \quad Ab_1 = c_1, \quad Ab_{i+1} = c_i + c_{i+1} \ (i \ge 1), \quad Ac_i = 0.$$

Clearly A is bounded and $A^3 = 0$. Suppose $H = E \oplus F$ where E and F reduce
A. We show that either E or F must be H.
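
The nilpotency claim can be checked on a finite truncation. The sketch below is an illustration only: it assumes the reading $Aa_1 = b_1$, $Ab_1 = c_1$, $Ab_{i+1} = c_i + c_{i+1}$, $Ac_i = 0$, restricts A to the span of $a_1, b_1, \ldots, b_n, c_1, \ldots, c_n$ (which A maps into itself), and uses hypothetical helper names.

```python
def build_truncated_operator(n):
    """Matrix of A on span{a1, b1..bn, c1..cn}; column j holds the image of basis vector j."""
    dim = 1 + 2 * n
    a1 = 0

    def b(i):
        return i          # index of b_i, i = 1..n

    def c(i):
        return n + i      # index of c_i, i = 1..n

    A = [[0] * dim for _ in range(dim)]
    A[b(1)][a1] = 1                 # A a1 = b1
    A[c(1)][b(1)] = 1               # A b1 = c1
    for i in range(1, n):           # A b_{i+1} = c_i + c_{i+1}
        A[c(i)][b(i + 1)] = 1
        A[c(i + 1)][b(i + 1)] = 1
    return A                        # A c_i = 0: those columns stay zero

def matmul(X, Y):
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

n = 5
A = build_truncated_operator(n)
A2 = matmul(A, A)
A3 = matmul(A2, A)
print(all(entry == 0 for row in A3 for entry in row))  # A^3 vanishes
print(A2[n + 1][0])                                    # coefficient of c1 in A^2 a1
```

The only non-zero entry of $A^2$ is the one sending $a_1$ to $c_1$, which is the fact used in the proof of Lemma 5.1.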


LEMMA 5.1. If $\lambda_1 a_1 + \sum \mu_i b_i + \sum \nu_i c_i \in E$ and $\lambda_1' a_1 + \sum \mu_i' b_i + \sum \nu_i' c_i \in F$
then either $\lambda_1$ or $\lambda_1' = 0$.


Proof. Let $A^2$ operate on each. Then $\lambda_1 c_1 \in E$ and $\lambda_1' c_1 \in F$. Hence $\lambda_1$ or
$\lambda_1' = 0$, otherwise E and F would not be orthogonal.


It follows that E or F is orthogonal to $a_1$. Thus $a_1 \in E$ or F. Suppose without
loss of generality that $a_1 \in E$. Then $b_1 = Aa_1 \in E$ and $c_1 = Ab_1 \in E$.


LEMMA 5.2. $b_2 \in E$.


Proof. Suppose

$$\lambda_2 b_2 + \sum_{i>2} \lambda_i b_i + \sum_i \mu_i c_i \in F.$$

Then
$$A\Bigl[\lambda_2 b_2 + \sum_{i>2} \lambda_i b_i + \sum_i \mu_i c_i\Bigr] \in F.$$

In the latter expression the coefficient of $c_1$ is $\lambda_2$. Since $c_1 \in E$, $\lambda_2 = 0$. Since
this is true for all points in F, $b_2$ is orthogonal to F and is hence in E.













Now $c_2 = Ab_2 - c_1 \in E$. By induction all the b's and c's are in E, thus
showing that E = H.


REFERENCES 


1. H. Gonshor, Spectral theory for a class of non-normal operators, Ph.D. thesis (Harvard, 1953).

2. H. Gonshor, Spectral theory for a class of non-normal operators, Can. J. Math., 8 (1956), 449-461.

3. I. Kaplansky, The structure of certain operator algebras, Trans. Amer. Math. Soc., 70 (1951),
219-233.





Pennsylvania State University 



















ON THE CONVERGENCE OF MEAN VALUES 
OVER LATTICES 


WOLFGANG SCHMIDT 


Introduction. Recently C. A. Rogers (2, Theorem 4) proved the following 
theorem which applies to many problems in geometry of numbers: 

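Let $f(X_1, X_2, \ldots, X_k)$ be a non-negative Borel-measurable function in the
nk-dimensional space of points $(X_1, X_2, \ldots, X_k)$. Further, let $\Lambda_0$ be the fundamental
lattice, $\Omega$ a linear transformation of determinant 1, F a fundamental region
in the space of linear transformations of determinant 1, defined with respect
to the subgroup of unimodular transformations, and $\mu(\Omega)$ the invariant measure¹
on the space of linear transformations of determinant 1 in $R_n$. Then, if $1 \le k \le n - 1$,

$$(1)\quad \int_F \sum_{X_1,\ldots,X_k \in \Omega\Lambda_0} f(X_1,\ldots,X_k)\,d\mu(\Omega) = f(0,\ldots,0) + \int\!\cdots\!\int f(X_1,\ldots,X_k)\,dX_1\cdots dX_k$$
$$\qquad + \sum_{(\nu;\mu)} \sum_{q=1}^{\infty} \sum_{D} \left(\frac{N(D,q)}{q^m}\right)^{n} \int\!\cdots\!\int f\Bigl(\sum_{i=1}^{m} \frac{d_{i1}}{q} X_i, \ldots, \sum_{i=1}^{m} \frac{d_{ik}}{q} X_i\Bigr)\,dX_1\cdots dX_m,$$

both sides having perhaps the value $+\infty$. The outer sum on the right side is
over all divisions $(\nu;\mu) = (\nu_1, \ldots, \nu_m; \mu_1, \ldots, \mu_{k-m})$ of the numbers $1, 2, \ldots, k$
into two sequences $\nu_1, \ldots, \nu_m$ and $\mu_1, \ldots, \mu_{k-m}$ with $1 \le m \le k - 1$,
$1 \le \nu_1 < \nu_2 < \ldots < \nu_m \le k$, $1 \le \mu_1 < \mu_2 < \ldots < \mu_{k-m} \le k$,

$$(2)\quad \nu_i \ne \mu_j, \qquad 1 \le i \le m;\ 1 \le j \le k - m.$$

The inner sum is over all $m \times k$-matrices D with integral elements, having
highest common factor relatively prime to q, and with

$$(3)\quad d_{i\nu_j} = q\,\delta_{ij},\quad 1 \le i \le m;\ 1 \le j \le m; \qquad d_{i\mu_j} = 0 \ \text{if}\ \mu_j < \nu_i,\quad 1 \le i \le m;\ 1 \le j \le k - m.$$

Finally, $N(D,q)$ is the number of sets of integers $(a_1, a_2, \ldots, a_m)$ with $0 \le a_i < q$
and
$$\sum_{i=1}^{m} d_{ij} a_i \equiv 0 \pmod q, \qquad 1 \le j \le k.$$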


Received January 29, 1957. 
'F and the invariant measure are defined in (5). 




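Rogers (2) wrote
$$\left(\frac{e_1}{q} \cdot \frac{e_2}{q} \cdots \frac{e_m}{q}\right)^{n} \quad \text{instead of} \quad \left(\frac{N(D,q)}{q^m}\right)^{n},$$
where $e_i = (\epsilon_i, q)$ and $\epsilon_1, \ldots, \epsilon_m$ are the elementary divisors of D. By Lemma
1 of (2),
$$\frac{e_1 \cdots e_m}{q^m} = \frac{N(D,q)}{q^m}.$$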
Another proof of Rogers’ theorem is given in (4). 
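
For small cases the quantity N(D, q) can be counted directly from its definition. The sketch below assumes the reading "number of $(a_1, \ldots, a_m)$ with $0 \le a_i < q$ and $\sum_i d_{ij} a_i \equiv 0 \pmod q$ for every column j"; the function name `count_N` is illustrative.

```python
from itertools import product

def count_N(D, q):
    """Brute-force N(D, q): D is an m x k integer matrix given as a list of rows."""
    m, k = len(D), len(D[0])
    return sum(
        1
        for a in product(range(q), repeat=m)
        if all(sum(D[i][j] * a[i] for i in range(m)) % q == 0 for j in range(k))
    )

# diag(2, 3) has elementary divisors 1 and 6, so with q = 6 the product
# e_1 * e_2 = gcd(1, 6) * gcd(6, 6) = 6, matching the direct count.
print(count_N([[2, 0], [0, 3]], 6))
```

The enumeration costs $q^m$ steps, so it is only meant as a check of the identity $e_1 \cdots e_m = N(D,q)$ on toy examples.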
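We write $(\rho;\sigma) < (\nu;\mu)$ if

$$(\rho;\sigma) = (\rho_1, \ldots, \rho_m; \sigma_1, \ldots, \sigma_{k-m}), \qquad (\nu;\mu) = (\nu_1, \ldots, \nu_m; \mu_1, \ldots, \mu_{k-m})$$
and $\rho_1 = \nu_1, \rho_2 = \nu_2, \ldots, \rho_{i-1} = \nu_{i-1}, \rho_i < \nu_i$ for some $i \le m$. If $m < k$ and
D is an $m \times k$-matrix, then we denote by $D(\nu;\mu)$ the square submatrix with
columns $\nu_1, \nu_2, \ldots, \nu_m$ and by $\det D(\nu;\mu)$ the absolute value of the determinant
of $D(\nu;\mu)$.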
In this paper we prove two theorems: 


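THEOREM 1. Rogers' theorem remains true, if (3) is replaced by

$$(4)\quad \begin{cases} D(\nu;\mu) = qI, \\ \det D(\rho;\sigma) \le \det D(\nu;\mu) \ \text{for any}\ (\rho;\sigma) = (\rho_1, \ldots, \rho_m; \sigma_1, \ldots, \sigma_{k-m}), \\ \det D(\rho;\sigma) < \det D(\nu;\mu) \ \text{if}\ (\rho;\sigma) < (\nu;\mu). \end{cases}$$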
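
Condition (4) is easy to test for a concrete matrix. The sketch below is hedged: it assumes that the first line of (4) reads $D(\nu;\mu) = qI$, that the ordering of divisions is the lexicographic order on the column tuples, and it uses 0-based column indices; the function names are hypothetical.

```python
from itertools import combinations

def det(M):
    """Integer determinant by Laplace expansion (adequate for small m)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def abs_minor(D, cols):
    """Absolute value of the determinant of the submatrix with the given columns."""
    return abs(det([[row[c] for c in cols] for row in D]))

def satisfies_condition_4(D, nu, q):
    m, k = len(D), len(D[0])
    nu = tuple(nu)
    # Assumed first line of (4): the columns nu form q times the unit matrix.
    for i in range(m):
        for j, c in enumerate(nu):
            if D[i][c] != (q if i == j else 0):
                return False
    top = abs_minor(D, nu)
    for rho in combinations(range(k), m):
        val = abs_minor(D, rho)
        # Non-strict bound for every rho, strict bound when rho precedes nu.
        if val > top or (rho < nu and val >= top):
            return False
    return True

print(satisfies_condition_4([[2, 1]], (0,), 2))
print(satisfies_condition_4([[2, 3]], (0,), 2))
```

The second call fails because the entry 3 exceeds q = 2, in line with the remark that (4) only admits matrices with $|d_{ij}| \le q$; the strict inequality acts as a tie-break that makes the admissible division unique.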


Theorem 1 provides better estimates for the sum in (1), since (4) permits
only matrices D with $|d_{ij}| \le q$. We further prove


THEOREM 2. If $f(X_1, \ldots, X_k)$ is bounded and vanishes outside a bounded
region of space, then both sides of (1) are finite.


Theorem 2 is an improvement of Rogers' result, that (1) is finite, under the
stated conditions, if $n \ge [\tfrac14 k^2] + 2$. Theorem 2 guarantees finiteness for all
cases of Rogers' theorem, that is, for $k < n$. No results are known² for $n = k$
or $n < k$.


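1. LEMMA 1. If $f(X_1, \ldots, X_k) \ge 0$, then

$$(5)\quad \sum_{X_1,\ldots,X_k \in \Lambda} f(X_1,\ldots,X_k) = f(0,\ldots,0) + \sum_{\substack{X_1,\ldots,X_k \in \Lambda \\ \dim(X_1,\ldots,X_k) = k}} f(X_1,\ldots,X_k)$$
$$(6)\quad \qquad + \sum_{(\nu;\mu)} \sum_{q=1}^{\infty} \sum_{D} \sum_{\substack{Y_1,\ldots,Y_m \in \Lambda \\ \dim(Y_1,\ldots,Y_m) = m \\ \sum_i d_{ij} Y_i / q \in \Lambda}} f\Bigl(\sum_{i=1}^{m} \frac{d_{i1}}{q} Y_i, \ldots, \sum_{i=1}^{m} \frac{d_{ik}}{q} Y_i\Bigr).$$

The sum extends over all D which satisfy (4).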


*In 2 Theorem 5, m > 2 should be replaced by nm > 2. 














Lemma 1 is analogous to a formula of Rogers, where D satisfies (3).


Proof of Lemma 1. The set of points $0, 0, \ldots, 0$ occurs in the term
$f(0, 0, \ldots, 0)$ and if $X_1, \ldots, X_k$ are linearly independent, then $f(X_1, \ldots, X_k)$
will occur just once in the sum

$$\sum_{\substack{X_1,\ldots,X_k \in \Lambda \\ \dim(X_1,\ldots,X_k) = k}} f(X_1,\ldots,X_k).$$

If $0 < \dim(X_1, \ldots, X_k) = m < k$, then $X_1, \ldots, X_k$ span an m-dimensional
space S. If $(\nu;\mu) = (\nu_1, \ldots, \nu_m; \mu_1, \ldots, \mu_{k-m})$, then let $d(\nu;\mu; X_1, \ldots, X_k)$
be the volume of the m-dimensional parallelepiped spanned by
$X_{\nu_1}, \ldots, X_{\nu_m}$.
There exists a uniquely determined $(\nu;\mu)$ so that

$$d(\rho;\sigma; X_1, \ldots, X_k) \le d(\nu;\mu; X_1, \ldots, X_k) \quad \text{for any } (\rho;\sigma)$$

and
$$d(\rho;\sigma; X_1, \ldots, X_k) < d(\nu;\mu; X_1, \ldots, X_k) \quad \text{if } (\rho;\sigma) < (\nu;\mu).$$

Every point $X_j$ can be expressed uniquely in the form

$$X_j = \sum_{i=1}^{m} c_{ij} X_{\nu_i} = \sum_{i=1}^{m} \frac{d_{ij}}{q} X_{\nu_i},$$

where $c_{ij}$ are rationals and $d_{ij}$, $q > 0$ are integers so that the highest common
factor of the $d_{ij}$ is relatively prime to q. Clearly, $D = (d_{ij})$ and q satisfy (4).
Further, if we take

$$Y_i = X_{\nu_i},$$
then $Y_1, Y_2, \ldots, Y_m$ are linearly independent points of $\Lambda$, and the points
$$X_j = \sum_{i=1}^{m} \frac{d_{ij}}{q} Y_i$$

are points of $\Lambda$. Consequently, there is a term

$$f\Bigl(\sum_{i=1}^{m} \frac{d_{i1}}{q} Y_i, \ldots, \sum_{i=1}^{m} \frac{d_{ik}}{q} Y_i\Bigr) = f(X_1, \ldots, X_k)$$

in the sum (6), corresponding to the points $X_1, \ldots, X_k$. It is clear that
$(\nu;\mu)$, q and D are uniquely determined by the points $X_1, X_2, \ldots, X_k$. So
corresponding to each k-tuple of points $X_1, X_2, \ldots, X_k$ there will be just
one term in the sum (6).
Conversely, it is easy to see that each term in (6) corresponds to just one
term in (5). Since f is non-negative, Lemma 1 follows.


Proof of Theorem 1. We make use of the following theorem of Rogers 
(2, Theorem 3): 
















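Let $f(X_1, \ldots, X_m)$ be a Borel-measurable function which is integrable in the
Lebesgue sense over the whole $X_1, \ldots, X_m$-space. Then the lattice function

$$f(\Lambda) = \sum_{\substack{X_1,\ldots,X_m \in \Lambda \\ \dim(X_1,\ldots,X_m) = m \\ \sum_i d_{ij} X_i / q \in \Lambda}} f(X_1, \ldots, X_m)$$

is Borel-measurable in the space of lattices of determinant 1 and

$$\int_F f(\Omega\Lambda_0)\,d\mu(\Omega) = \left(\frac{N(D,q)}{q^m}\right)^{n} \int\!\cdots\!\int f(X_1, \ldots, X_m)\,dX_1 \cdots dX_m.$$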


A combination of this theorem and Lemma 1 gives (1), where the sum 
is extended over all D with (4). This proves Theorem 1. 


2. Some properties of systems of linear congruences. There exist 
many papers on this subject (for example, (1)), but it seems desirable to 
develop the theory in a way which is most suitable for our purposes. 


Let $a_1, a_2, \ldots, a_m$ be integral vectors, $p^t$ a power of a prime. We define the
rank $r(p^t)$ of $a_1, a_2, \ldots, a_m$ (mod $p^t$) to be k, if there exists a subset R of k
vectors

$$a_{i_1}, \ldots, a_{i_k}$$

so that each vector is (mod $p^t$) a linear combination of vectors of R, and if
k is the least integer with this property. We say R is a basis of $a_1, a_2, \ldots, a_m$
(mod $p^t$). If H is an integral matrix, then we define the rank $r(H, p^t)$ to be the
rank $r(p^t)$ of the rows of H.

We investigate the set $\mathfrak{H}(u,v; r_1, r_2, \ldots, r_t; p)$ of matrices H which have u
rows, v columns and $r(H, p^j) = r_j$ $(1 \le j \le t)$. If

$$H \in \mathfrak{H}(u,v; r_1, r_2, \ldots, r_t; p),$$

then there exist bases $R_j$ of all rows (mod $p^j$), consisting of $r_j$ rows. $R_j$ has
(mod $p^{j-1}$) rank $s_{j-1} \le r_{j-1}$ and a basis $S_{j-1}$ (mod $p^{j-1}$) consisting of $s_{j-1}$
rows. ($s_{j-1} \le r_{j-1}$ follows from the fact that if vectors $b_1, \ldots, b_r$ are linearly
dependent on $c_1, \ldots, c_s$ (mod $p^t$), then there is a subset of s vectors
$b_{i_1}, \ldots, b_{i_s}$
so that $b_1, \ldots, b_r$ are linearly dependent on
$b_{i_1}, \ldots, b_{i_s}$ (mod $p^t$).
This fact can be verified similarly to the corresponding proofs in vector-algebra.)
Each row is a linear combination of rows of $R_j$ (mod $p^j$), hence a
fortiori (mod $p^{j-1}$). Consequently, $S_{j-1}$ is a basis $R_{j-1}$ (mod $p^{j-1}$) of all
rows of H and we have $s_{j-1} = r_{j-1}$, $R_{j-1} \subset R_j$. If therefore
$H \in \mathfrak{H}(u,v; r_1, \ldots, r_t; p)$, then there exists a sequence of bases $R_j$ (mod $p^j$), each
consisting of $r_j$ rows and with $R_1 \subset R_2 \subset \ldots \subset R_t$.







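We define $\mathfrak{G}(u,v; r_1, r_2, \ldots, r_t; p)$ to be the subset of those
$H \in \mathfrak{H}(u,v; r_1, r_2, \ldots, r_t; p)$ which have a sequence of bases $R_1 \subset \ldots \subset R_t$ so that $R_j$
consists of the first $r_j$ rows

$$b_1, b_2, \ldots, b_{r_j}$$

of H $(1 \le j \le t)$. If $N_H(u,v; r_1, \ldots, r_t; p)$ is the number of
$H \in \mathfrak{H}(u,v; r_1, \ldots, r_t; p)$ (mod $p^t$) and $N_G(u,v; r_1, \ldots, r_t; p)$ is the number of
$H \in \mathfrak{G}(u,v; r_1, \ldots, r_t; p)$ (mod $p^t$), then

$$(7)\quad N_H(u,v; r_1, \ldots, r_t; p) \le u!\, N_G(u,v; r_1, \ldots, r_t; p).$$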
LemMA 2. If H € G(u,vjri,...,7ip) has the rows b1,b2,..., Ox, and if 


(8) b 


a, bc, (mod p’) 
{_ 
then there exist d; (1 < i < 1;), so that 


(9) b 


Z. bd; (mod p’) 0<d,<p’*ifi>r,, 
i=1 
where 1 ce <j < t+ 1; we write ro = 0, roi = u. 
Proof of Lemma 2. The lemma is true if 
C,341 > Cri 42 =. @ Cr; = 0. 
Using induction on f, we assume it to be true for 
(10) Crptt = Crna =... = C,, = 0 
and prove it for 


(11) Cogaitl = Cosuiez2? =~... = Cy; = 0. 


If (11) holds, then 
rf+i 


b= Lu bic, (mod p’). 


If ry <i < rps, then 
rf 


5}; = oe wii + ba: (mod p’). 


Therefore, 
rs Ty +i r 
b= yi Dice +2 { > Widi + base, (mod p’). 
t= Ts -_ 
If we take dy = c, (mod p*’), 0 < dy < p*! (r7 + 1 < i < rps), then 
rf+i 
(12) h=p’ + 2D bedi (mod p’) 
. 


where b’ can be written in the form (8) with (10), whence by induction in the 
form (9). Therefore and by (12) we proved Lemma 2 for all h with (8) and 
(11). 













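LEMMA 3.
$$(13)\quad N_H(u,v; r_1, \ldots, r_t; p) \le u!\, p^{(u+v)(r_1 + \ldots + r_t) - r_1^2 - \ldots - r_t^2}.$$
Proof of Lemma 3. Because of (7) it suffices to prove
$$N_G(u,v; r_1, \ldots, r_t; p) \le p^{(u+v)(r_1 + \ldots + r_t) - r_1^2 - \ldots - r_t^2}.$$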


If bi,..., 5, are the rows of H € G(u,viri,..., rip) and if r; << s < ry, 
then 5, can be written in the form (8), hence by Lemma 2 in the form (9). 
There are 


(p’)"(9"*) act es _ priteet sos Oey 
possibilities for the coefficients d. If therefore §,, . . . , 6,1 are given, we have 
p"'*+ ---+*" possibilities for }, (mod p’), times p‘~”* possibilities if we fix }, 
(mod p‘). This gives p”'* - - - +"#+(* possibilities. Hence, 

Nduar.....71) <9". geretaera . 
gpreteatte-Satey-e0) gint - » re) (u—re) 


— ed ooo $rg)—ri*—.. . —re* 


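LEMMA 4. If $Z_H(u,v; r_1, \ldots, r_t; p)$ is the maximal number of solutions (mod
$p^t$) of an equation
$$(14)\quad b_1 x_1 + \ldots + b_u x_u \equiv 0 \pmod{p^t},$$
where $b_1, b_2, \ldots, b_u$ are the rows of a matrix $H \in \mathfrak{H}(u,v; r_1, \ldots, r_t; p)$, then
$$(15)\quad Z_H(u,v; r_1, \ldots, r_t; p) \le p^{tu - r_1 - \ldots - r_t}.$$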


Proof of Lemma 4. It is enough to prove (15) for $H \in \mathfrak{G}(u,v; r_1, \ldots, r_t; p)$.
First we choose

$$x_{r_t+1}, \ldots, x_u$$

arbitrarily. This gives $p^{t(u - r_t)}$ possibilities (mod $p^t$). The number of solutions
of (14) with fixed

$$x_{r_t+1}, \ldots, x_u$$

is at most equal to the number of solutions of the homogeneous equation

$$(16)\quad b_1 x_1 + \ldots + b_{r_t} x_{r_t} \equiv 0 \pmod{p^t}.$$

Since $b_1, \ldots, b_{r_t}$ have rank $r_t$ (mod $p^t$), all $x_j$ have to be multiples of p,
that is, $x_j = p y_j$ $(1 \le j \le r_t)$. Hence we have the new system

$$(17)\quad b_1 y_1 + \ldots + b_{r_t} y_{r_t} \equiv 0 \pmod{p^{t-1}}.$$

System (17) is similar to (14), we only substituted $r_t$ for u, $t - 1$ for t. By
repeated application of this argument we see that

$$Z_H(u,v; r_1, \ldots, r_t; p) \le p^{t(u - r_t) + (t-1)(r_t - r_{t-1}) + \ldots + (r_2 - r_1)} = p^{tu - r_1 - \ldots - r_t}.$$
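
The bound of Lemma 4 can be compared with a brute-force count on small matrices. The sketch below assumes the reading $Z_H \le p^{tu - r_1 - \ldots - r_t}$ and the definition of rank (mod $p^j$) as the least number of rows of which every row is a linear combination; all function names are illustrative and the search is exponential, so it is only usable for toy sizes.

```python
from itertools import combinations, product

def is_combination(row, basis, modulus):
    """True if row is congruent (mod modulus) to some integer combination of basis rows."""
    for coeffs in product(range(modulus), repeat=len(basis)):
        if all(
            (sum(c * b[j] for c, b in zip(coeffs, basis)) - row[j]) % modulus == 0
            for j in range(len(row))
        ):
            return True
    return False

def rank_mod(H, modulus):
    """Least k such that some k rows of H generate every row of H (mod modulus)."""
    u = len(H)
    for k in range(u + 1):
        for idx in combinations(range(u), k):
            basis = [H[i] for i in idx]
            if all(is_combination(row, basis, modulus) for row in H):
                return k
    return u

def count_row_relations(H, modulus):
    """Number of x (mod modulus) with x_1 b_1 + ... + x_u b_u = 0 (mod modulus)."""
    u, v = len(H), len(H[0])
    return sum(
        1
        for x in product(range(modulus), repeat=u)
        if all(sum(x[i] * H[i][j] for i in range(u)) % modulus == 0 for j in range(v))
    )

def lemma4_bound(H, p, t):
    ranks = [rank_mod(H, p ** j) for j in range(1, t + 1)]
    return p ** (t * len(H) - sum(ranks))

H = [[1, 0], [0, 2]]          # ranks 1 (mod 2) and 2 (mod 4)
print(count_row_relations(H, 4), lemma4_bound(H, 2, 2))
```

For this matrix the count equals the bound, so the estimate cannot be improved in general.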


If 













then we define the set of matrices 


TirP ity. ++ 5 Trey 
To1,% 22) - ++» Tees i 

U,V; cea q@] = H(u,vipiq) = n A (uv a,.. + TieiPa- 
PT its s+ +> Trey - 


Let Ny(u,v;p;q) be the number of H € H(u,v;p;¢) (mod g) and Zyz(u,v;9;¢) 
the maximal number of solutions (mod q) of 


(18) biti +... + Bux, = 0 (mod gq) 
where 5;,52,..., 6, are rows of an H € H(u,v;p;q). 

We observe 
(19) Ny(u,v;0;q) = I Ny(tvira,..-, TeiPd 
and 
(20) Zy(u,v;9:9) = I Zy(Uvi7 a, - ++» TresiP1)- 


3. Proof of Theorem 2. If $f(X_1, \ldots, X_k)$ is a non-negative Borel-measurable
function, then (1) holds. We are going to show that if, in addition,
f is bounded and vanishes outside a bounded region of space, then both sides
of (1) are finite.

There is only a finite number of divisions $(\nu;\mu)$. Hence it suffices to prove
the convergence of the sum for a given $(\nu;\mu)$. Finally we observe that, under
the stated conditions, the integrals

$$(21)\quad \int\!\cdots\!\int f\Bigl(\sum_{i=1}^{m} \frac{d_{i1}}{q} X_i, \ldots, \sum_{i=1}^{m} \frac{d_{ik}}{q} X_i\Bigr)\,dX_1 \cdots dX_m$$

are less than a fixed constant. Therefore it remains to show the convergence
of
$$\sum_{q=1}^{\infty} \sum_{D} \left(\frac{N(D,q)}{q^m}\right)^{n}$$
where $(\nu;\mu) = (\nu_1, \ldots, \nu_m; \mu_1, \ldots, \mu_{k-m})$ is given and D runs through all
matrices satisfying (4). D has m rows, k columns.
If $D \in \mathfrak{H}(m,k; \rho; q)$,
$$q = \prod_{i=1}^{l} p_i^{t_i},$$
then, by definition of $N(D,q)$, $N(D,q) \le Z_H(m,k; \rho; q)$. How many matrices
D in $\mathfrak{H}(m,k; \rho; q)$ satisfy (4)? Since the columns

$$\nu_1, \ldots, \nu_m$$

are fixed and $\equiv 0$ (mod q), there are $\le N_H(m, k - m; \rho; q)$ possibilities modulo
q and because of $|d_{ij}| \le q$ at most $3^{m(k-m)} N_H(m, k - m; \rho; q)$ possibilities.













Consequently, by (19) and (20), 


5 (MDa) ¢ sme py (Zarda 5 trait) 
D q i=I Pi 


Nyu(m,k — m;ru,..., Ties Pi). 


The summation is taken over all D € H(m,k; p; q) which satisfies (4). 
By summation over all g,p, we obtain 


= "(Day)" mtn) | 

(22) u > ( q” * I] ded 

(Zales : 

o™ 

The sum on the right hand side of (22) is over all sequences $1 \le r_1 \le r_2 \le \ldots \le r_t \le m$
with arbitrary t. We have $r_1 \ge 1$, because $r_1 = 0$ would imply
that all elements of D are multiples of p, and p, D were not relatively prime.
It is a consequence of (13) and (15) that 


(Zaher ES ss 


leri<. . . <te<m 


2112) Nam at.) ee rb) | . 


- reb)) Nuy(m,k — m3r,...,TeiP) 


(23) p™ 
< (I) (k —_ m)\p*\""* ~ + + $%e)—1742— - . —Pe* 
_ (k = aig ror - + + $fe)—ry2%—... ~—P 

We have 


—(n—k) (rit... +re)—ri2@—. . . —re? _ - — —[{(m—k) 1+ 12) ¢ 
(24) l<rni< > a IT (= p ) 
=F . I] (1+ ap er) ~—Y. Cpe 


where C is a constant. Finally, the product 


fy (1+ S@=s)e-2) 


n—k+1 
> Pp 


is convergent. This fact, together with (22), (23) and (24), yields Theorem 2. 
By estimates for the integrals (21), provided by (3), it would be possible
to find good bounds for (1).


REFERENCES 


1. A. T. Butson and B. M. Stewart, Systems of linear congruences, Can. J. Math., 7 (1955),
358-368.

2. C. A. Rogers, Mean values over the space of lattices, Acta Math., 94 (1955), 249-287.

3. C. A. Rogers, A single integral inequality, to appear in the Journal of the London Math. Soc.

4. W. Schmidt, Mittelwerte über Gitter, Monatshefte f. Math., 61 (1957), 000-000.

5. C. L. Siegel, A mean value theorem in geometry of numbers, Annals of Math., 46 (1945),
340-347.





Montana State University 











A GENERALIZED TAUBERIAN THEOREM 
F. R. KEOGH anp G. M. PETERSEN 


Let $\{s(n)\}$ be a real sequence and let x be any number in the interval
$0 < x < 1$. Representing x by a non-terminating binary decimal expansion
we shall denote by $\{s(n,x)\}$ the subsequence of $\{s(n)\}$ obtained by omitting
$s(k)$ if and only if there is a 0 in the kth decimal place in the expansion of x.
With this correspondence it is then possible to speak of "a set of subsequences
of the first category," "an everywhere dense set of subsequences," and so
on.
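
The correspondence between points x and subsequences can be made concrete for a finite prefix. The sketch below assumes x is given as an exact rational in (0, 1] and produces the non-terminating expansion (so 1/2 is read as 0.0111... rather than 0.1000...); the function names are illustrative.

```python
from fractions import Fraction

def binary_digits(x, count):
    """First `count` digits of the non-terminating binary expansion of x in (0, 1]."""
    digits = []
    r = Fraction(x)
    for _ in range(count):
        r *= 2
        if r > 1:          # strict inequality keeps the expansion non-terminating
            digits.append(1)
            r -= 1
        else:
            digits.append(0)
    return digits

def subsequence_from_bits(s, bits):
    """Keep s(k) exactly when the kth binary digit is 1."""
    return [value for value, bit in zip(s, bits) if bit == 1]

s = [10, 20, 30, 40, 50, 60]
bits = binary_digits(Fraction(1, 3), len(s))   # 1/3 = 0.010101...
print(bits, subsequence_from_bits(s, bits))
```

Using the non-terminating expansion guarantees infinitely many digits equal to 1, so every x really selects an infinite subsequence.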

Suppose that T is a regular summability transform given by the matrix
$(a_{mn})$ and let $t(m,x) = \sum_n a_{mn} s(n,x)$. In a previous note (3), extending a
theorem of Buck (2), we proved that a real sequence $\{s(n)\}$ is convergent if
there exists a T which sums a set of subsequences of the second category.
Our object now is to generalize this Tauberian theorem to the following:


THEOREM. Suppose that $\{s(n)\}$ is a real sequence and there is a T such that
$$\limsup t(m,x) - \liminf t(m,x) < \epsilon$$
in a set of the second category. Then

$$\limsup s(n) - \liminf s(n) \le \epsilon.$$


The possibility of such a generalization of a Tauberian theorem has been 
pointed out by Bowen and Macintyre (1). 

We first show that, under the hypothesis of the theorem, $\{s(n)\}$ is bounded.
Suppose, on the contrary, that $\{s(n)\}$ is unbounded. In (3) we proved that
when $\{s(n)\}$ is unbounded then, on the one hand, if $(a_{mn})$ has infinitely many
rows of finite length, $\limsup t(m,x) - \liminf t(m,x)$ is finite only in a set of
the first category and, on the other hand, if $(a_{mn})$ has only a finite number of
rows of finite length, $\{s(n,x)\}$ is in the domain of T only in a set of the first
category. In either case we have a contradiction and it follows that $\{s(n)\}$
is bounded. We may now prove the conclusion of the theorem with the added
hypothesis that $\{s(n)\}$ is bounded. Under this hypothesis, by the following
lemma, we may further assume that $(a_{mn})$ is row finite.


LEMMA 1. Given a regular transform T with matrix $(a_{mn})$ we can find a transform
with a row finite matrix $(a'_{mn})$ such that, for every bounded sequence $\{s(n)\}$,

$$\sum_n a_{mn} s(n) - \sum_n a'_{mn} s(n) \to 0.$$


Received June 7, 1957. 











Suppose that $|s(n)| < K$. For each m choose $k_m$ so that

$$\sum_{n=k_m+1}^{\infty} |a_{mn}| < \frac{1}{m}$$
and define $a'_{mn} = a_{mn}$ for $n \le k_m$, $a'_{mn} = 0$ for $n > k_m$. Then

$$\Bigl|\sum_n a_{mn} s(n) - \sum_n a'_{mn} s(n)\Bigr| = \Bigl|\sum_{n > k_m} a_{mn} s(n)\Bigr| < \frac{K}{m}.$$

We next prove


LEMMA 2. Let $\{v(n)\}$ be any sequence of 0's and 1's containing an infinity of
both 0's and 1's. Then for any integers p, N and any regular row finite matrix
$(a_{mn})$, there is a subsequence $\{v(j_n)\}$ such that

(i) $\limsup \sum_n a_{mn} v(j_n) - \liminf \sum_n a_{mn} v(j_n) \ge 1$,
(ii) $j_p > N$.

The subsequence described in the lemma is obtained in the following way.
Writing $a_{mn} = a(m,n)$, let $a(m, N(m))$ be the last non-zero number in the
mth row of $(a_{mn})$. We first choose $m_1$ so that

$$\Bigl|\sum_{n=1}^{N(m_1)} a(m_1,n) - 1\Bigr| < 1$$

and start the subsequence with $N(m_1)$ 1's. We then choose $m_2$, with
$N(m_2) > N(m_1)$, so that

$$\Bigl|\sum_{n=1}^{N(m_1)} a(m_2,n)\Bigr| < \tfrac12,$$

and continue the subsequence with $N(m_2) - N(m_1)$ 0's. At the kth stage, if
k is odd, we choose $m_k$ so that $N(m_k) > N(m_{k-1})$ and

$$\Bigl|\Bigl(\sum_{n=1}^{N(m_1)} + \sum_{N(m_2)+1}^{N(m_3)} + \sum_{N(m_4)+1}^{N(m_5)} + \ldots + \sum_{N(m_{k-1})+1}^{N(m_k)}\Bigr) a(m_k,n) - 1\Bigr| < \frac{1}{k}.$$

We then continue the subsequence with $N(m_k) - N(m_{k-1})$ 1's. If k is even we
choose $m_k$ so that $N(m_k) > N(m_{k-1})$ and

$$\Bigl|\Bigl(\sum_{n=1}^{N(m_1)} + \sum_{N(m_2)+1}^{N(m_3)} + \ldots + \sum_{N(m_{k-2})+1}^{N(m_{k-1})}\Bigr) a(m_k,n)\Bigr| < \frac{1}{k}.$$

We then continue the subsequence with $N(m_k) - N(m_{k-1})$ 0's. The possibility
of this construction is ensured by the facts that, $(a_{mn})$ being regular,

$$\lim_{m\to\infty} \sum_n a_{mn} = 1, \qquad \lim_{m\to\infty} a_{mn} = 0.$$

Plainly the subsequence $\{v(j_n)\}$ so constructed satisfies the inequality (i).
It is obvious, moreover, since $\{v(n)\}$ contains an infinity of both 0's and 1's,
that given any integers p, N, we may choose $j_p$ so that $j_p > N$, the inequality
(ii).
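
The block construction can be traced on a concrete matrix. The sketch below is an illustration under assumptions not in the paper: it takes the Cesàro matrix a(m, n) = 1/m for n ≤ m (so N(m) = m), uses the tolerance 1/k at the kth stage, and always picks the smallest admissible row; `build_blocks` is a hypothetical helper.

```python
from fractions import Fraction

def cesaro(m, n):
    """Entry a(m, n) of the (C,1) matrix: 1/m for n <= m, else 0 (indices from 1)."""
    return Fraction(1, m) if n <= m else Fraction(0)

def build_blocks(stages):
    """Return the chosen rows m_1 < m_2 < ... and the resulting 0/1 sequence w."""
    rows, w = [], []
    for k in range(1, stages + 1):
        target = 1 if k % 2 == 1 else 0        # odd stages append 1's, even stages 0's
        m = rows[-1] + 1 if rows else 1
        while True:
            # Since N(m) = m here, the candidate pads w with `target` up to place m.
            candidate = w + [target] * (m - len(w))
            t = sum(cesaro(m, n) * candidate[n - 1] for n in range(1, m + 1))
            if abs(t - target) < Fraction(1, k):
                break
            m += 1
        rows.append(m)
        w = candidate
    return rows, w

rows, w = build_blocks(4)
print(rows, w)
```

Along the odd-numbered rows the transform of w is within 1/k of 1 and along the even-numbered rows within 1/k of 0, which is the oscillation asserted in (i).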













We now proceed to the proof of the equivalent of the theorem: if
$$\limsup s(n) - \liminf s(n) > \epsilon,$$

then $\limsup t(m,x) - \liminf t(m,x) < \epsilon$ only in a set of the first category.
We prove first that the set D of x such that $\limsup t(m,x) - \liminf t(m,x) > \epsilon$
is everywhere dense.

Let $\liminf s(n) = L$, $\limsup s(n) - \liminf s(n) = H$ and define

$$(1)\quad u(n) = \frac{1}{H}\,(s(n) - L).$$

Then $\limsup u(n) = 1$, $\liminf u(n) = 0$ and we can choose two subsequences
$\{u(k_n)\}$, $\{u(p_n)\}$, such that $\lim u(k_n) = 1$, $\lim u(p_n) = 0$, and $k_i \ne p_j$ for all
i, j. Let $\{u(i_n)\}$ be the subsequence of $\{u(n)\}$ obtained by combining these
two subsequences, arranging them so that the suffixes are in ascending order.


Now let $\{v(i_n)\}$ be defined by $v(i_n) = 1$ if $i_n = k_j$ for some j, $v(i_n) = 0$ if
$i_n = p_j$ for some j. Then
$$(2)\quad \lim\,(v(i_n) - u(i_n)) = 0.$$


By Lemma 2 (i), since it is a sequence of 0's and 1's and contains an infinity
of both 0's and 1's, $\{v(i_n)\}$ has a subsequence $\{v(j_n)\}$, say, such that

$$(3)\quad \limsup \sum_n a_{mn} v(j_n) - \liminf \sum_n a_{mn} v(j_n) \ge 1.$$

By Lemma 2 (ii), moreover, given p and any subsequence $\{s(q_n)\}$ of $\{s(n)\}$
we may choose $j_p$ so that $j_p > q_{p-1}$, and then $\{s(r_n)\} = s(q_1), s(q_2), \ldots, s(q_{p-1}),
s(j_p), s(j_{p+1}), \ldots$ is a subsequence of $\{s(n)\}$. By varying $\{s(q_n)\}$ and p, we
obtain an everywhere dense set of subsequences, whose representative points
will be shown to lie in D. In fact, since

$$\lim_{m\to\infty} a_{mn} = 0,$$

we have by (1) and (2)
$$\limsup \sum_n a_{mn} s(r_n) = \limsup \sum_n a_{mn} s(j_n) = \limsup \sum_n a_{mn} (H u(j_n) + L) = \limsup \sum_n a_{mn} (H v(j_n) + L)$$
and similar equalities with lim sup replaced by lim inf. Thus, by (3),
$$\limsup \sum_n a_{mn} s(r_n) - \liminf \sum_n a_{mn} s(r_n) = H\Bigl(\limsup \sum_n a_{mn} v(j_n) - \liminf \sum_n a_{mn} v(j_n)\Bigr) \ge H > \epsilon.$$


Finally, let $S_n^k$ ($k = 1,2,\dots$; $n = 1,2,\dots$) denote the set of $x$ such that there exist $u, v > n$ for which

$$|t_u(x) - t_v(x)| > \epsilon - \frac{1}{k}.$$

Since $(a_{mn})$ is row finite, $S_n^k$ is obviously open and, since it contains $D$, it is everywhere dense. If

$$x \in \bigcap_{k=1}^{\infty} \bigcap_{n=1}^{\infty} S_n^k,$$

then

$$\limsup t(m,x) - \liminf t(m,x) \geq \epsilon - \frac{1}{k}$$

for all $k$ and so $\limsup t(m,x) - \liminf t(m,x) \geq \epsilon$. The set of $x$ for which $\limsup t(m,x) - \liminf t(m,x) < \epsilon$ therefore belongs to

$$\bigcup_{k=1}^{\infty} \bigcup_{n=1}^{\infty} \complement S_n^k$$

and so is of the first category.
We wish to thank the referee for his comments and suggestions. 


REFERENCES 


1. N. A. Bowen and A. J. Macintyre, An oscillation theorem of Tauberian type, Quart. J. Math., 
2 (1950), 243-247. 

2. R. C. Buck, A note on subsequences, Bull. Amer. Math. Soc., 49 (1943), 898-899.

3. F. R. Keogh and G. M. Petersen, A universal Tauberian theorem, J. Lond. Math. Soc. (to 
appear). 


University College of Swansea 













ON THE EXISTENCE OF THE 
BURKILL INTEGRAL 


H. KOBER 


1. Introduction. The main problem of the present paper is the existence of the Burkill integral of an interval function $f(I)$ which is not supposed to be continuous. Little is known about this case, though otherwise the theory of the integral can be considered as complete: we may refer to Ringenberg's comprehensive paper (2) in which further references are given.

We shall deal with the problem by introducing the notion of infinitesimal additivity and will show that the indefinite integral can be continuous even when $f(I)$ is not. Finally we apply the main result to the generalised arc length of a curve; the result appears not to be known even with respect to the familiar notion of arc length.

Let $R$ ($0 \leq x_j \leq A_j$; $j = 1,2,\dots,n$) be a fixed interval in the Euclidean space $Eu_n$ ($n \geq 1$), and let an interval function $f(I)$ be defined for any closed interval

$$I\ (a_j \leq x_j \leq b_j;\ 0 \leq a_j < b_j \leq A_j) \subset R.$$

Any $f(I)$ is supposed to be finite for every $I \subset R$. The following result is known (2; 3, p. 168).


THEOREM 1. If (i) $f(I)$ increases by subdivision (abbreviation $f(I) \in SA$), (ii) the upper Burkill integral of $|f(I)|$ over $R$ is finite, and (iii) $f(I)$ is continuous on $R$, then its Burkill integral over $R$ and, therefore, over any $I \subset R$, exists.

We replace the condition of continuity by a much weaker one.

THEOREM 1'. (a) Theorem 1 holds when the condition (iii) is replaced by (iii'): $f(I)$ is infinitesimally additive on $R$ (see 2.1).

(b) When $R$ is the linear interval $(0, A)$ and $f(I)$ is subadditive, then $f(I)$ is Burkill integrable if and only if (i) its lower Burkill integral is bounded above and (ii) $f(I)$ is infinitesimally additive on $R$.

Again for non-continuous $f(I)$, $\int_I f$ can be continuous (§ 5).


2. Some additional definitions and notations. A representation of an interval $I$ in the form $I = I_1 + \dots + I_n = \sum I_k$ is said to be a subdivision $S$ of $I$, or $S(I)$. The $I_k$'s are always required to be finite in number and not to overlap. We write $f(S)$ for $\sum f(I_k)$; $\|I_k\|$ for the diameter of $I_k$; $\|S\|$ for $\max \|I_k\|$ ($k = 1,\dots,n$). When the $I_k$'s are arranged in rows and columns ($n \geq 2$) $S$ is said to be a mesh-division. Any finite number of non-overlapping


Received May 7, 1957. 


intervals form a figure, denoted by $F$ or $\sum I_k$. While $I$ is closed, $I^\circ$ is open; $|I|$ is its Lebesgue measure, in $Eu_2$ for example, the area of the interval. The upper, lower Burkill integral; the Burkill integral, respectively, of $f(I)$ over $I \subset R$ are denoted by $U_I f$, $L_I f$; $\int_I f$. The existence of the latter integral is meant to imply its finiteness.

If, for any $I \subset R$,

$$f(I) \leq f(I_1) + f(I_2) \qquad (I = I_1 + I_2),$$

then $f(I)$ is said to be subadditive ($f(I) \in sA$); if, for any $S$, $f(I) \leq f(S)$, then $f(I) \in SA$. Clearly $SA \subset sA$; and $SA = sA$ in $Eu_1$.

By $i$ we denote any oriented interval of $n - 1$ dimensions such that $i^\circ \subset R^\circ$ ($n \geq 2$), while in $Eu_1$, $i$ is any point of $R^\circ = (0, A)$. Thus in $Eu_2$, $i$ is any line segment parallel to one of the axes which does not form part of the perimeter of $R$ and has no point outside $R$. When $n = 2$ and both end-points of $i$, or $n = 3$ and all the sides of $i$ lie on $R - R^\circ$, etc., we use sometimes the notation $i^*$. A function $f(I)$ is said to be infinitesimally additive if for any fixed $i$

$$f(I_1) + f(I_2) - f(I) \to 0 \qquad (I = I_1 + I_2 \subset R) \tag{2.1}$$

whenever $I_1 I_2 = i$ and $|I| \to 0$. An interval $i^*$ is said to be irregular if there is at least one sub-interval $i \subset i^*$ for which 2.1 does not hold. We remark that, if $f(I)$ is of bounded variation over $R$ (abbreviation: $f \in V$; $V_I f$ = total variation of $f$ over $I$), then the limits of

$$f(I_1),\ f(I_2),\ f(I) \qquad (i \text{ fixed},\ I_1 I_2 = i,\ |I_1 + I_2| = |I| \to 0)$$

exist, and the irregular $i^*$ are countable.


3. Some lemmas 


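LEMMA 1. If $\int_R f$ exists then, given $\epsilon > 0$, there is a $\delta > 0$ such that whenever a figure $\sum I_k \subset R$ and $\max \|I_k\| < \delta$ ($k = 1,2,\dots$),

$$\Bigl|\sum_k \Bigl\{f(I_k) - \int_{I_k} f\Bigr\}\Bigr| < \tfrac{1}{2}\epsilon, \tag{3.11}$$

$$\sum_k \Bigl|f(I_k) - \int_{I_k} f\Bigr| < \epsilon. \tag{3.12}$$

While 3.11 is known (2; 3, p. 167), 3.12 is deduced from it by considering separately the intervals $I_k$ for which the corresponding differences occurring in the sum in 3.11 are $\geq 0$ or $< 0$, respectively.

We proceed to some elementary existence theorems.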

We proceed to some elementary existence theorems. 


LEMMA 2. The integral $\int_R f$ exists if, and only if, given $\epsilon > 0$, there is a $\delta > 0$ such that for subdivisions $S_1$, $S_2$ of $R$

$$|f(S_1) - f(S_2)| < \epsilon \qquad (\|S_j(R)\| < \delta,\ j = 1,2); \tag{3.21}$$

or if

$$\Bigl|\sum_k \Bigl\{f(I_k) - \sum_j f(I_{kj})\Bigr\}\Bigr| < \epsilon \tag{3.221}$$

or

$$\sum_k \Bigl|f(I_k) - \sum_j f(I_{kj})\Bigr| < \epsilon \tag{3.222}$$

whenever $R = \sum I_k$, $\max \|I_k\| < \delta$ and $I_k = \sum_j I_{kj}$.

Part 3.21 is trivial. The necessity of the condition of 3.221 follows from it; so does the sufficiency, when $S_1$ is taken as $\sum I_k$ and $\sum_{k,j} I_{kj}$ as a subdivision both of $S_1(R)$ and of $S_2(R)$. Hence the condition 3.222 is sufficient also. Its necessity is deduced from 3.12 by the additivity of the Burkill integral.


LEMMA 3a. When $f(I) \in SA$ the integral exists if, and only if, (i) $L_R f < \infty$, and (ii) given $\epsilon > 0$, there is a $\delta > 0$ such that for any figure $F = \sum I_k \subset R$, with $\|F\| < \delta$ and $I_k = I_{k1} + I_{k2}$,

$$\sum_k \{f(I_{k1}) + f(I_{k2}) - f(I_k)\} < \epsilon. \tag{3.3}$$

LEMMA 3b. In $Eu_1$, (ii) can be replaced by the weaker condition (ii') $f(I)$ is infinitesimally additive (see 2.1).


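Proof of Lemma 3a. The necessity of the condition follows immediately from 3.221. To prove the converse we may consider $Eu_n$ for $n = 2$ only. Clearly $L_R f \geq f(R) > -\infty$. We show that, for any $S(R)$, $f(S) \leq L_R f$; which implies that

$$U_R f \leq L_R f, \qquad U_R = L_R = \int_R f.$$

Since $f(I) \in SA$ we may for convenience suppose that $S(R)$ ($R = \sum I_j$) be a mesh-division. Fixing $\epsilon$ and $\delta$ according to (ii), we find an $S^*(R)$ ($R = \sum I_k$) such that $f(S^*) < L_R f + \epsilon$, and that $\|S^*\|$ is smaller than $\delta$ and the sides of each $I_j$. Denote the $I_k$'s that lie in one, two or four of the $I_j$ by $I_\alpha$, $I_\beta$ or $I_\gamma$, respectively, and set

$$I_\beta = I_{\beta 1} + I_{\beta 2}, \qquad I_\gamma = I_{\gamma 1} + \dots + I_{\gamma 4},$$

where each $I_{\beta i}$, $I_{\gamma i}$ belongs to one $I_j$ only. As $f(I) \in SA$,

$$f(S) \leq \sum_\alpha f(I_\alpha) + \sum_\beta \sum_i f(I_{\beta i}) + \sum_\gamma \sum_i f(I_{\gamma i})$$

$$= \sum_k f(I_k) + \sum_\beta \Bigl\{\sum_i f(I_{\beta i}) - f(I_\beta)\Bigr\} + \sum_\gamma \Bigl\{\sum_i f(I_{\gamma i}) - f(I_{\gamma 1} + I_{\gamma 2}) - f(I_{\gamma 3} + I_{\gamma 4})\Bigr\} + \sum_\gamma \{f(I_{\gamma 1} + I_{\gamma 2}) + f(I_{\gamma 3} + I_{\gamma 4}) - f(I_\gamma)\}$$

$$< f(S^*) + 3\epsilon < L_R f + 4\epsilon$$

by (ii). Taking $\epsilon \to 0$ we complete the proof.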













Proof of Lemma 3b. We proceed as before, but observe that in $Eu_1$ the $I_k$'s are either of the type $I_\alpha$ or $I_\beta$ and that the number of the $I_\beta$'s is less than $N$, the number of the $I_j$'s. Taking a suitable $S^*$ we show that, given $\epsilon > 0$,

$$f(S) < L_R f + \epsilon + N\epsilon;$$

therefore, $U_R = L_R$, so that the integral exists. Conversely, to obtain 2.1 we take $F = I \supset i$ and observe that $\|F\| = |I|$ in this case.

Finally we state two results which are not difficult to prove.

If $f(I) \in SA$ and $U_R f$ is finite and $L_I f$ additive, then $\int_R f$ exists.

In $Eu_1$ if $U_R|f| < \infty$ and $f(I)$ is infinitesimally additive, then both $U_I f$ and $L_I f$ are additive.


4. Proof of Theorem 1'. Again we show that (iii') is not a necessary condition.

Part (b) follows from Lemma 3b. To deal with (a) we take $n = 2$. We have to show that, for any

$$S(R)\ \Bigl(R = \sum I_k\Bigr), \qquad f(S) \leq L_R f;$$

we may suppose that $S$ be a mesh-division since $f(I) \in SA$. Let $i_1^*, \dots, i_\nu^*$ and $i_{\nu+1}^*, \dots, i_{\nu+r}^*$ be the lines generating $S$, parallel to the $x_1$ and $x_2$ axis, respectively. If the variation of $f(I)$ on $R$ is zero at each of these lines then, given $\epsilon > 0$, we deduce by a known argument (3, pp. 166, 168) that $f(S) < L_R f + \epsilon$ and take $\epsilon \to 0$. Suppose now that the variation of $f(I)$ on $R$ is not zero at $i_1^*$, say. Let $I_1, I_3, \dots, I_{2r+1}$ be the intervals lying between $x_2 = 0$ and $i_1^*$; $I_2, I_4, \dots, I_{2r+2}$ between $i_1^*$ and $i_2^*$. We can draw $i'$ between $x_2 = 0$ and $i_1^*$, $i''$ between $i_1^*$ and $i_2^*$, both lines parallel to $i_1^*$ and arbitrarily near it, and such that the variation of $f(I)$ on $R$ vanishes at $i'$ and $i''$; these lines divide $I_{2k-1}$ or $I_{2k}$ ($k = 1,2,\dots,r+1$) into $I_{2k-1,1}$ and $I_{2k-1,2}$, or into $I_{2k,1}$ and $I_{2k,2}$, respectively, where $I_{2k-1,2}$ and $I_{2k,1}$ are adjacent. Set $I_{2k-1,2} + I_{2k,1} = I_k'$. As $f(I) \in SA$,

$$f(I_{2k-1}) + f(I_{2k}) \leq f(I_{2k-1,1}) + f(I_k') + f(I_{2k,2}) + \Delta_k;$$

$$\Delta_k = f(I_{2k-1,2}) + f(I_{2k,1}) - f(I_k') \to 0 \qquad (i' \to i_1^*,\ i'' \to i_1^*),$$

since $f(I)$ is infinitesimally additive. Proceeding in this way we replace $S(R)$ by a $S'(R)$ such that the variation at each of the lines producing $S'(R)$ is zero and that

$$f(S) < f(S') + \epsilon;$$

which completes the proof.


Clearly (iii') is weaker than the condition that $f(I)$ be continuous; in $Eu_n$ ($n \geq 2$) however, (iii') is not a necessary condition either. Consider the square $0 \leq x_1 \leq 1$, $0 \leq x_2 \leq 1$ and take $f(I) = 0$ except in the following cases.

(a) One side of $I$ is formed by a segment, of length $l$ ($0 < l \leq 1$) say, of the line $x_1 = \tfrac{1}{2}$. Then take $f(I) = l$.

(b) Part of the line $x_1 = \tfrac{1}{2}$ is contained in the interior of $I$, and the total length $l$ of the closed segment concerned is $< 1$. Then $f(I) = 2l$.

Plainly $f(I) \in SA$, $f(I)$ is integrable, and $\int_R f = 2$. Yet $f(I)$ is not infinitesimally additive. Take $i^*$ as the segment $0 \leq x_2 \leq 1$ of $x_1 = \tfrac{1}{2}$. Then

$$l = 1, \quad f(I_1) = f(I_2) = 1 \qquad (I_1 I_2 = i^*,\ I = I_1 + I_2)$$

while $f(I) = 0$. Thus the term

$$f(I_1) + f(I_2) - f(I) = 2$$

and does not tend to zero.
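The counterexample can be checked with exact arithmetic. The following is only an illustrative sketch, assuming the reading of cases (a) and (b) in which $f(I) = l$ when a side of $I$ lies on the line $x_1 = \tfrac{1}{2}$ and $f(I) = 2l$ when the line crosses the interior of $I$ with $l < 1$; the names `f`, `mesh_sum` and `defect` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def f(a1, b1, a2, b2):
    """Interval function of the example for I = [a1, b1] x [a2, b2] in the unit square."""
    length = b2 - a2                      # length l of the part of x1 = 1/2 met by I
    if a1 == HALF or b1 == HALF:          # case (a): a side of I lies on the line
        return length
    if a1 < HALF < b1 and length < 1:     # case (b): line crosses the interior, l < 1
        return 2 * length
    return Fraction(0)                    # every other interval

def mesh_sum(n):
    """f(S) for the n-by-n mesh-division S of the unit square."""
    pts = [Fraction(i, n) for i in range(n + 1)]
    return sum(f(pts[i], pts[i + 1], pts[j], pts[j + 1])
               for i in range(n) for j in range(n))

def defect(eps):
    """f(I1) + f(I2) - f(I) for I1, I2 meeting along the whole segment i*."""
    zero, one = Fraction(0), Fraction(1)
    left = f(HALF - eps, HALF, zero, one)
    right = f(HALF, HALF + eps, zero, one)
    whole = f(HALF - eps, HALF + eps, zero, one)
    return left + right - whole

mesh_values = [mesh_sum(n) for n in (4, 5, 20)]
defects = [defect(Fraction(1, 10 ** k)) for k in range(1, 6)]
print(mesh_values, defects)
```

Under these assumptions every mesh sum equals 2, in agreement with $\int_R f = 2$, while the defect stays at 2 however thin $I$ becomes.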


5. Continuity of the indefinite integral. We deduce

THEOREM 2. Suppose that $f(I)$ is Burkill integrable.

(a) Then $F(I) = \int_I f$ is continuous on $R$ if and only if, given $\epsilon > 0$, there are numbers $\delta, \eta > 0$ such that $|\sum f(I_k)| < \epsilon$ whenever $|I| < \delta$, $I = \sum I_k$ and $\|I_k\| < \eta$.

(b) The continuity of $f(I)$ is necessary and sufficient for that of $F(I)$ (i) in $Eu_1$, (ii) when

$$|f(I)| \leq \Bigl|\sum f(I_k)\Bigr| \qquad \Bigl(I = \sum I_k\Bigr),$$

for instance when $f(I)$ increases by subdivision and is non-negative.

The statement (a) is deduced from the inequality

$$\Bigl|\,\Bigl|\sum_k f(I_k)\Bigr| - |F(I)|\,\Bigr| < \tfrac{1}{2}\epsilon \qquad (\max \|I_k\| < \eta)$$

which follows from Lemma 1. Since continuity is well-known to be a sufficient condition (3, p. 167), (bii) is now evident. In $Eu_1$, we have $|I| = \|I\|$. Hence it is necessary that

$$|f(I)| < \epsilon \quad \text{for } |I| < \min(\delta, \eta).$$

Thus $f(I)$ is continuous.

Note that in $Eu_n$, $n > 1$, the condition in (a) does not imply continuity of $f(I)$. Take $n = 2$; $R$ as the square

$$0 \leq x_j \leq 1; \qquad f(I) = |I| + l^2$$

when $I$ touches the line $x_1 = 1$ along a segment of length $l$, $f(I) = |I|$ otherwise. Clearly $f(I)$ is not continuous, while $F = \int_R f$ exists; $F(I) = |I|$, which is continuous.
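A short numerical sketch of this last example, assuming the reading $f(I) = |I| + l^2$ for intervals whose right-hand side lies on $x_1 = 1$; the helper names are illustrative only. The mesh sums tend to $|R| = 1$, while thin strips along $x_1 = 1$ keep $f$ above 1 although their area tends to zero.

```python
from fractions import Fraction

def f(a1, b1, a2, b2):
    """f(I) = |I| + l^2 if I touches x1 = 1 along a segment of length l, else |I|."""
    area = (b1 - a1) * (b2 - a2)
    if b1 == 1:                  # right-hand side of I lies on the line x1 = 1
        return area + (b2 - a2) ** 2
    return area

def mesh_sum(n):
    """f(S) for the n-by-n mesh-division S of the unit square."""
    pts = [Fraction(i, n) for i in range(n + 1)]
    return sum(f(pts[i], pts[i + 1], pts[j], pts[j + 1])
               for i in range(n) for j in range(n))

# Mesh sums equal 1 + 1/n, so they approach |R| = 1 as the mesh is refined.
sums = {n: mesh_sum(n) for n in (1, 10, 100)}

# Strips [1 - 10^-k, 1] x [0, 1]: area -> 0, yet f stays above 1 (no continuity).
thin_strips = [f(1 - Fraction(1, 10 ** k), Fraction(1), Fraction(0), Fraction(1))
               for k in range(1, 5)]
print(sums, thin_strips)
```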


6. A rectifiable curve. The curve $C\{x_1(t), x_2(t), \dots, x_m(t)\}$ in $Eu_m$ is defined by functions $x_j(t)$ of bounded variation over

$$R = (0, a), \quad j = 1,2,\dots,m, \quad x_j(0-) = x_j(0), \quad x_j(a+) = x_j(a).$$

Its arc length $\Lambda_C$ is

$$\Lambda_C = \text{l.u.bd.} \sum_k F(I_k); \qquad F(I) = \Bigl\{\sum_{j=1}^{m} (x_j(I))^2\Bigr\}^{1/2}, \quad \sum I_k = R, \quad \max |I_k| \to 0,$$

where $x_j(I) = x_j(t_2) - x_j(t_1)$ for $I = (t_1, t_2) \subset R$. If all the $x_j(t)$ are continuous then not only the upper bound, but also the proper limit of $\sum F(I_k)$, that is $\int_R F(I)$, is known to exist. We deduce

THEOREM 3. Given a curve $C\{x_1(t), \dots, x_m(t)\}$, the Burkill integral $\int_R F(I)$ exists if, and only if, $C$ is normal. This holds for the generalised form of $F(I)$ as defined below.

Definition 1. A curve $C\{x_1(t), \dots, x_m(t)\}$ is normal if all $x_j(t) \in V$ and if, for any $t \in R^\circ$, there is a $\rho_t$ ($0 \leq \rho_t \leq 1$) such that

$$x_j(t) = \rho_t\, x_j(t-) + (1 - \rho_t)\, x_j(t+), \qquad j = 1,\dots,m; \tag{6.1}$$

that is, if any point associated with $t$ lies on the line segment joining the two points $x_j(t-)$, $x_j(t+)$ ($j = 1,\dots,m$); which clearly coincide when all the $x_j(t)$ are continuous at $t$.


Definition 2 (Generalisation of the arc length). Let the function $f(y_1, y_2, \dots, y_m)$ ($0 \leq y_j < \infty$) be (i) non-negative, (ii) continuous, (iii) strictly increasing concerning each $y_j$, (iv) homogeneous of degree one and (v) subadditive, i.e.

$$f(y_1 + z_1, \dots, y_m + z_m) \leq f(y_1, \dots, y_m) + f(z_1, \dots, z_m) \qquad (0 \leq y_j < \infty,\ 0 \leq z_j < \infty)$$

and such that there is equality only if the $z_j$ and $y_j$ are effectively proportional (that is, for some finite $\sigma \geq 0$, $y_j = \sigma z_j$ or $z_j = \sigma y_j$). Clearly

$$F(I) = f(|x_1(I)|, \dots, |x_m(I)|) \in sA$$

as the $x_j(I)$ are additive; and the generalised arc length is defined as the upper Burkill integral $U_R F$. We obtain the ordinary arc length when

$$f(y_1, \dots, y_m) = \Bigl(\sum_{j=1}^{m} y_j^p\Bigr)^{1/p}, \qquad p = 2,$$

but $f$ satisfies the above conditions also for $1 < p < \infty$ by Minkowski's inequality (1, §2.11). So does, for instance, the function

$$f(y_1, y_2) = (y_1^2 + k y_1 y_2 + y_2^2)^{1/2}, \qquad 0 \leq k < 2.$$


By (iv), $f = 0$ for $y_1 = y_2 = \dots = y_m = 0$, while $f > 0$ otherwise by (iii). Again by (iii) and (iv),

$$f \leq f(1, \dots, 1) \max y_j, \quad f \geq y_1 f(1,0,\dots,0), \quad f \geq y_2 f(0,1,0,\dots,0), \ \dots.$$

Thus

$$f(1,\dots,1) \sum_j |x_j(I)| \geq F(I) \geq c \sum_j |x_j(I)|; \qquad c = m^{-1}\min\{f(1,0,\dots), f(0,1,0,\dots), \dots\}, \tag{6.2}$$

so that $U_R F < \infty$ if and only if all $x_j(t) \in V$.

Suppose now that 6.1 be satisfied. Take any point $t = i \in R^\circ$,

$$I_1 = (t_1, t), \quad I_2 = (t, t_2), \quad I = (t_1, t_2) \subset R; \quad |I| \to 0.$$

$$F(I_1) + F(I_2) - F(I) \to f(|u_1|, \dots, |u_m|) + f(|v_1|, \dots, |v_m|) - f(|w_1|, \dots, |w_m|) \tag{6.3}$$

where $u_j = x_j(t) - x_j(t-)$, $v_j = x_j(t+) - x_j(t)$, $w_j = x_j(t+) - x_j(t-)$, and by 6.1, $u_j = (1 - \rho_t) w_j$, $v_j = \rho_t w_j$. By the homogeneity of $f$ the expression on the right in 6.3 vanishes. Hence $F(I)$ is infinitesimally additive, therefore $\int_R F(I)$ exists.

Conversely suppose that $F(I)$ is Burkill integrable. Then it must be infinitesimally additive. By 6.3, therefore,

$$f(|w_1|, \dots, |w_m|) = f(|u_1|, \dots, |u_m|) + f(|v_1|, \dots, |v_m|). \tag{6.4}$$

Now $|w_j| \leq |u_j| + |v_j|$. By (iii) and (v), therefore, 6.4 remains true when $|w_j|$ is replaced by $|u_j| + |v_j|$. Hence

$$f\{(|u_1| + |v_1|), (|u_2| + |v_2|), \dots\} = f(|u_1|, \dots) + f(|v_1|, \dots). \tag{6.5}$$

Any $u_j$ and $v_j$ have equal signs; for if $u_j v_j$ were negative for some $j$, $|w_j|$ would be $< |u_j| + |v_j|$, and since $f$ is strictly monotone, 6.4 and 6.5 would contradict each other. Hence $u_j v_j \geq 0$. By (v), 6.5 implies that the $|u_j|$ and $|v_j|$ be effectively proportional. Thus for some $\sigma \geq 0$, depending on $t$ only, $v_j = \sigma u_j$ ($j = 1,2,\dots,m$) or $u_j = \sigma v_j$. Taking $\rho_t = \sigma(1 + \sigma)^{-1}$ or $(1 + \sigma)^{-1}$, respectively, we arrive at 6.1. This completes the proof.
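For the ordinary arc length ($f$ the Euclidean norm) the role of the normality condition at a jump can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the paper: `jump_defect` is a hypothetical helper that evaluates $f(|u|) + f(|v|) - f(|w|)$ from the three points $x(t-)$, $x(t)$, $x(t+)$, which is the limit of $F(I_1) + F(I_2) - F(I)$ at a discontinuity.

```python
import math

def norm(y):
    """Euclidean norm, i.e. the function f of Definition 2 with p = 2."""
    return math.sqrt(sum(c * c for c in y))

def jump_defect(x_minus, x_at, x_plus):
    """f(|u|) + f(|v|) - f(|w|) for u = x(t) - x(t-), v = x(t+) - x(t), w = u + v."""
    u = [a - m for a, m in zip(x_at, x_minus)]
    v = [p - a for p, a in zip(x_plus, x_at)]
    w = [p - m for p, m in zip(x_plus, x_minus)]
    return norm(u) + norm(v) - norm(w)

x_minus, x_plus = (0.0, 0.0), (4.0, 2.0)

# Normal case: x(t) = rho * x(t-) + (1 - rho) * x(t+) with rho = 1/4.
rho = 0.25
on_segment = tuple(rho * m + (1 - rho) * p for m, p in zip(x_minus, x_plus))
normal_defect = jump_defect(x_minus, on_segment, x_plus)

# Non-normal case: x(t) lies off the segment joining x(t-) and x(t+).
abnormal_defect = jump_defect(x_minus, (3.0, 0.0), x_plus)
print(normal_defect, abnormal_defect)
```

The defect vanishes (up to rounding) when $x(t)$ lies on the segment and is strictly positive otherwise, which is the dichotomy used in the proof.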


Remark 1. Clearly (iv) and (v) imply that $f$ is a convex function. For $n \geq 1$, $U_I F$ ($x_j(I) \in V$, $I \subset I_0 \subset Eu_n$) is a lower semi-continuous functional.


Remark 2. There are applications of our main theorem to the areas of 
surfaces z = f(x,y) which are not continuous or even nowhere continuous. 


REFERENCES 


1. G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities (Cambridge, 1934).
2. L. A. Ringenberg, The theory of the Burkill integral, Duke Math. J., 15 (1948), 239-270. 
3. S. Saks, Theory of the Integral (2nd revised ed., New York). 


Birmingham, England 








ON GENERALIZED AVERAGING OPERATORS 
R. P. BOAS, JR. 


1. Introduction. Let

$$\nabla_h f(z) = \tfrac{1}{2}[f(z + h) + f(z)].$$

Sumner (4) has discussed $\nabla_h^\lambda f(z)$ for arbitrary real $\lambda$ and $h$, where $f(z)$ is an entire function of exponential type $\tau < \pi/|h|$. I shall show that in this case an alternative definition of $\nabla_h^\lambda$, which leads to Sumner's results more quickly, is equivalent to Sumner's. (However, Sumner's definition is, in principle, applicable to a wider class of functions.) I also make some remarks on Sumner's question about the existence of nontrivial solutions of $\nabla_h^\lambda f = 0$. In particular, I shall show that in a certain sense there are such solutions for every positive $\lambda$.


2. Definitions. Let $f(z)$ be an entire function of exponential type $\tau$, let $S$ be its conjugate indicator diagram, and let $F(w)$ be the Borel-Laplace transform of $f$. (For terminology, see, for example, (1).) Then we have the Pólya representation

$$f(z) = (2\pi i)^{-1} \int_C F(w) e^{zw}\,dw, \tag{1}$$

where $C$ is a contour surrounding $S$; since $S$ is a subset of the disk $|w| \leq \tau$, $C$ can in particular be the circumference $|w| = \tau + \epsilon$, $\epsilon > 0$.

Let $\phi(w)$ be regular on $S$, hence on contours $C$ which are sufficiently close to the boundary of $S$. If $D$ stands for $d/dz$, we can define the operator $\phi(D)$ by

$$\phi(D)f(z) = (2\pi i)^{-1} \int_C F(w)\phi(w) e^{zw}\,dw, \tag{2}$$

where the definition is independent of the particular contour $C$ that is employed provided that we admit only contours lying within the domain of regularity of $\phi$. If $K[S]$ denotes the class of entire functions of exponential type whose conjugate indicator diagrams are subsets of $S$, the operator $\phi(D)$ applies to all elements of $K[S]$ and transforms them into elements of $K[S]$. (If $\phi$ has poles in $S$ we can still define $\phi(D)$ by (2) but we get, in general, different values for $\phi(D)f$ according to which $C$ we take, if there are poles of $\phi$ not in the conjugate indicator diagram of $f$.)

If $\psi$ is regular over the range of $\phi(w)$ for $w$ in $S$, we have

$$\psi[\phi(D)]f(z) = (2\pi i)^{-1} \int_C F(w)\,\psi[\phi(w)]\,e^{zw}\,dw,$$


Received June 5, 1957. 


so in particular

$$[\phi(D)]^\lambda f(z) = (2\pi i)^{-1} \int_C F(w)[\phi(w)]^\lambda e^{zw}\,dw \tag{3}$$

for any positive integral $\lambda$. If $\phi(w)$ is zero-free on $S$, (3) holds for arbitrary (real or imaginary) $\lambda$; we must of course specify which regular branch of $[\phi(w)]^\lambda$ is to be taken. If, for example, $[\phi(w)]^\lambda = e^{\lambda \log \phi(w)}$ with the same branch of $\log \phi(w)$ for all $\lambda$, it is then immediate that

$$[\phi(D)]^\lambda\{[\phi(D)]^\mu f(z)\} = [\phi(D)]^{\lambda+\mu} f(z). \tag{4}$$

As a special case, we see that $[\phi(D)]^{-\lambda}$ inverts $[\phi(D)]^\lambda$.

For example, if $\phi$ is regular and zero-free in a region containing $w = a$, we have

$$[\phi(D)]^\lambda e^{az} = [\phi(a)]^\lambda e^{az}. \tag{5}$$

An alternative representation is obtained by expanding $\phi(w)e^{w(z-\zeta)}$ in powers of $w$:

$$\phi(w)e^{w(z-\zeta)} = \sum_{n=0}^{\infty} w^n p_n(z - \zeta).$$

The $p_n(z)$ are the Appell polynomials (3) generated by $\phi(w)$, and (2) can be written

$$\phi(D)f(z) = \sum_{n=0}^{\infty} f^{(n)}(\zeta)\,p_n(z - \zeta), \tag{6}$$

where the series converges if $S$ is inside the largest circle of regularity of $\phi(w)$ with center at 0, and is Mittag-Leffler summable (2, p. 79) if $S$ is inside the Mittag-Leffler star of $\phi$ with respect to 0. In particular, if $\zeta = z$ and

$$\phi(w) = \sum_{n=0}^{\infty} c_n w^n,$$

(6) becomes

$$\phi(D)f(z) = \sum_{n=0}^{\infty} c_n f^{(n)}(z) = \sum_{n=0}^{\infty} c_n D^n f(z),$$

which is a very natural interpretation of $\phi(D)f$.
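The series interpretation can be tested exactly on polynomials, for which the sum terminates. The sketch below is illustrative only: it assumes $\phi(w) = \tfrac{1}{2}(e^{hw} + 1)$, so that $c_0 = 1$ and $c_n = h^n/(2\,n!)$ for $n \geq 1$, and checks that $\sum c_n f^{(n)}(z)$ reproduces $\tfrac{1}{2}[f(z+h) + f(z)]$. The helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Coefficients of p'(z), for p given by ascending coefficients."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, z):
    """Value p(z) of the polynomial with ascending coefficients."""
    return sum(c * z ** k for k, c in enumerate(coeffs))

def phi_D(coeffs, z, h):
    """Sum of c_n f^(n)(z) with c_0 = 1, c_n = h^n / (2 n!), i.e. phi(w) = (e^{hw} + 1)/2."""
    total = Fraction(0)
    current = list(coeffs)
    n = 0
    while current:                       # stops once all derivatives are exhausted
        c_n = Fraction(1) if n == 0 else Fraction(h) ** n / (2 * factorial(n))
        total += c_n * evaluate(current, z)
        current = derivative(current)
        n += 1
    return total

poly = [Fraction(1), Fraction(-2), Fraction(0), Fraction(3)]   # 1 - 2z + 3z^3
z, h = Fraction(2, 3), Fraction(1, 4)
series_value = phi_D(poly, z, h)
average_value = (evaluate(poly, z + h) + evaluate(poly, z)) / 2
print(series_value, average_value)
```

With exact rational arithmetic the two values coincide, since the Taylor expansion of a polynomial about $z$ is finite.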

In Sumner's case $\phi(D) = \nabla_h = \tfrac{1}{2}(e^{hD} + 1)$, so that $\phi(w) = \tfrac{1}{2}(e^{hw} + 1)$. If $h$ is any non-zero complex number, the zeros of $\phi$ closest to 0 are at $w = \pm i\pi/h$, so that (3) is valid (in the sense of transforming an element of $K[S]$ into an element of $K[S]$) if $f$ is an entire function of exponential type less than $\pi/|h|$, and more generally if $S$ avoids the points $(2k + 1)i\pi/h$. For example, (3) defines $\nabla_h^\lambda$ for all $\lambda$ if $h$ is real and $f(z)$ is an exponential sum

$$\sum_{j=0}^{n} a_j e^{b_j z}$$

with real $b_j$.

If we choose $\log\{\tfrac{1}{2}(e^{hw} + 1)\}$ so that it reduces to 0 at $w = 0$, we then have

$$\lim_{h\to 0} \nabla_h^\lambda f(z) = f(z),$$

uniformly on compact sets, if $f$ is an entire function of exponential type (the left-hand side being defined when $|h|$ is sufficiently small). We also have

$$\lim_{\lambda\to 0} \nabla_h^\lambda f(z) = f(z)$$

when $f(z)$ is of exponential type less than $\pi/|h|$.


3. Equivalence of the definitions. We now compare our definition of $\nabla_h^\lambda$ with Sumner's. Since both versions satisfy (4), and coincide for integral $\lambda$, and Sumner's was given only for real $\lambda$, it is enough to consider the range $0 < \lambda < 1$; for convenience we take $h > 0$.

For this range, Sumner's definition is

$$(S)\,\nabla_h^\lambda f(z) = \frac{1}{2^\lambda\,\Gamma(1-\lambda)\,2\pi i} \int_{b-i\infty}^{b+i\infty} \{f(z - hw) + f(z + h - hw)\}\,\Gamma(w)\Gamma(1 - \lambda - w)\,dw,$$

where $0 < b < 1 - \lambda$ and $f(z)$ is of exponential type less than $\pi/h$. Representing $f(z)$ by (1), we have

$$(S)\,\nabla_h^\lambda f(z) = \frac{1}{2^\lambda\,\Gamma(1-\lambda)\,(2\pi i)^2} \int_{b-i\infty}^{b+i\infty} \Gamma(w)\Gamma(1 - \lambda - w)\,dw \int_C \bigl(e^{(z-hw)t} + e^{(z+h-hw)t}\bigr) F(t)\,dt. \tag{7}$$

If we change the order of integration, this becomes

$$(S)\,\nabla_h^\lambda f(z) = \frac{1}{2^\lambda\,\Gamma(1-\lambda)\,(2\pi i)^2} \int_C F(t)\bigl(e^{zt} + e^{(z+h)t}\bigr)\,dt \int_{b-i\infty}^{b+i\infty} e^{-hwt}\,\Gamma(w)\Gamma(1 - \lambda - w)\,dw$$

$$= \frac{1}{2^\lambda\,2\pi i} \int_C e^{zt}(1 + e^{ht})\,F(t)\,(1 + e^{ht})^{\lambda-1}\,dt,$$

which coincides with (3) when $\phi(w) = \tfrac{1}{2}(e^{hw} + 1)$. Here we have used a definite integral quoted by Sumner (4, p. 438).
It remains to justify the change of order of integration. It is sufficient to verify that the iterated integral is absolutely convergent. Now, for fixed $z$ and $h$, $F(t)(e^{zt} + e^{(z+h)t})$ is bounded on $C$, so it is enough to show that

$$\int_{b-i\infty}^{b+i\infty} |e^{-hwt}\,\Gamma(w)\Gamma(1 - \lambda - w)\,dw| \tag{8}$$

converges uniformly for $t$ on $C$, where $0 < b < 1 - \lambda$. We can take $C$ to be a circumference $|t| = (\pi/h) - \epsilon$, $\epsilon > 0$, so that

$$|e^{-hwt}| \leq \exp\{h|w|((\pi/h) - \epsilon)\} = O(e^{c|y|})$$

with $c < \pi$ and independent of $t$. Now we have

$$\Gamma(1 - \lambda - w)\,\Gamma(\lambda + w) = \pi \csc \pi(1 - \lambda - w)$$

so that

$$|\Gamma(w)\Gamma(1 - \lambda - w)| \leq \pi\,|\csc \pi(1 - \lambda - w)|\,|\Gamma(w)/\Gamma(\lambda + w)|.$$

Since $\csc \pi(1 - \lambda - w) = O(e^{-\pi|y|})$ and

$$\Gamma(w)/\Gamma(\lambda + w) = O(|w|^{-\lambda}) = O(|y|^{-\lambda})$$

as $w = b + iy \to \infty$, the integrand in (8) is $O(e^{-\delta|y|})$ with $\delta > 0$ and independent of $t$. Hence (8) does converge uniformly for $t$ on $C$ and the change of order of integration in (7) is legitimate.


4. The existence of eigenvalues. Sumner noted that there are nontrivial solutions of $\nabla_h^\lambda f(z) = 0$ which are entire functions of exponential type, although not of type less than $\pi/|h|$, when $\lambda$ is a positive integer. It is easy to show that there are no nontrivial solutions which are of exponential type less than $\pi/|h|$. More generally, if $\phi(D)f(z)$ is defined by (2), where $\phi(w)$ is regular on $S$ and $F(w)$ is regular outside $S$, we have, if $\phi(D)f(z) \equiv 0$,

$$\int_C F(w)\phi(w)e^{zw}\,dw \equiv 0.$$

This implies, by a lemma of Pólya's (1, p. 110), that $F(w)\phi(w)$ is regular inside $C$. If $\phi(w)$ has no zeros inside $C$, $F(w)$ must be regular inside $C$, and so everywhere; hence, since $F$ vanishes at $\infty$, $F(w) \equiv 0$ and so $f(z) \equiv 0$.

On the other hand, if $\phi(w)$ has a zero at $w = a \in S$, (2) shows that $\phi(D)$ (and all its positive integral powers) annul the function $f$ whose Borel-Laplace transform $F$ is $F(w) = 1/(w - a)$, namely $f(z) = e^{az}$. Moreover, $[\phi(D)]^2$ also annuls the inverse Laplace transform of $1/(w - a)^2$, namely $f(z) = ze^{az}$; and so on.

In particular, $\nabla_h^\lambda f(z) = 0$ for positive integral $\lambda$ if $f(z) = e^{\pm i\pi z/h}$ (cf. (5)), or $\sin(\pi z/h)$ or $\cos(\pi z/h)$; $\nabla_h^\lambda f(z) = 0$ for $\lambda = 2,3,\dots$ if $f(z) = z\sin(\pi z/h)$ or $z\cos(\pi z/h)$; and so on.
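These annihilation statements for integral $\lambda$ are easy to confirm in floating point. The following sketch applies $\nabla_h f(z) = \tfrac{1}{2}[f(z+h) + f(z)]$ once and twice; the function name `nabla` and the sample values of $h$ and $z$ are arbitrary choices made for illustration.

```python
import math

def nabla(f, h):
    """Return the function z -> (f(z + h) + f(z)) / 2."""
    return lambda z: 0.5 * (f(z + h) + f(z))

h = 0.7
z0 = 0.3

def s(z):
    return math.sin(math.pi * z / h)

def g(z):
    return z * math.sin(math.pi * z / h)

first_power_on_sin = nabla(s, h)(z0)            # expected to vanish up to rounding
first_power_on_zsin = nabla(g, h)(z0)           # equals -(h/2) sin(pi z0 / h), not zero
second_power_on_zsin = nabla(nabla(g, h), h)(z0)  # expected to vanish up to rounding
print(first_power_on_sin, first_power_on_zsin, second_power_on_zsin)
```

The first power already annihilates $\sin(\pi z/h)$, whereas $z\sin(\pi z/h)$ needs the second power, matching the statement for $\lambda = 2, 3, \dots$.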

Sumner raised the question of whether there is a set of values of $\lambda$ which are eigenvalues for $\nabla_h^\lambda$ in the sense that for these, and only for these, there are nontrivial solutions (eigenfunctions) of $\nabla_h^\lambda f(z) = 0$. We have, of course, defined $\nabla_h^\lambda f(z)$, when $\lambda$ is not a positive integer, only when $f$ is of exponential type less than $\pi/|h|$, but it would be natural to extend the definition as follows. If $|k| < |h|$, and $f$ is of exponential type $\pi/|h|$, $\nabla_k^\lambda f(z)$ is defined and we take

$$\nabla_h^\lambda f(z) = \lim_{k\to h} \nabla_k^\lambda f(z) \qquad (|k| < |h|), \tag{9}$$

if the limit exists. We may, of course, lose property (4) with the extended definition.

Now by (5) we have, if $|k| < |h|$,

$$\nabla_k^\lambda e^{\pm i\pi z/h} = \{\tfrac{1}{2}(e^{\pm i\pi k/h} + 1)\}^\lambda e^{\pm i\pi z/h},$$

and if $\Re(\lambda) > 0$, this approaches 0 as $k \to h$. Thus if $\nabla_h^\lambda$ is defined by (9), every $\lambda$ of positive real part is an eigenvalue and $e^{\pm i\pi z/h}$ are eigenfunctions, just as they were for positive integral $\lambda$. Furthermore, $ze^{\pm i\pi z/h}$ are also eigenfunctions when $\Re(\lambda) > 1$, and so on.
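The limit in the extended definition can be observed numerically for a non-integral exponent. The sketch below is an illustration under stated assumptions: it takes $h = 1$, $\lambda = \tfrac{1}{2}$ and real $k \to h$ from below, and relies on Python's principal branch of the complex power, which agrees near these points with the branch of the logarithm that reduces to 0 at $w = 0$. The name `eigen_factor` is hypothetical.

```python
import cmath

def eigen_factor(k, h, lam):
    """{(e^{i*pi*k/h} + 1)/2}^lam, using Python's principal branch of the power."""
    return (0.5 * (cmath.exp(1j * cmath.pi * k / h) + 1)) ** lam

h = 1.0
lam = 0.5

# k = h (1 - 10^-j) approaches h from below; the modulus of the factor should shrink to 0.
moduli = [abs(eigen_factor(h * (1 - 10.0 ** -j), h, lam)) for j in range(1, 7)]
print(moduli)
```

The moduli decrease roughly like $\sqrt{\pi(h-k)/(2h)}$, consistent with the factor tending to 0 whenever the real part of $\lambda$ is positive.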

It is interesting to note that there are still other eigenfunctions. To see this, suppose that the Borel-Laplace transform $F(w)$ of $f(z)$ is such that $F(w)(w \pm i\pi/h)^\lambda$ is uniformly dominated by an integrable function in an annulus $\pi/|h| < |w| < a$. We can then shrink the contour $C$ in (2) to $|w| = \pi/|h|$ and obtain

$$\nabla_k^\lambda f(z) = (2\pi i)^{-1} \int_{|w| = \pi/|h|} F(w)\{\tfrac{1}{2}(e^{kw} + 1)\}^\lambda e^{zw}\,dw;$$

then we let $k \to h$, and obtain finally

$$\nabla_h^\lambda f(z) = (2\pi i)^{-1} \int_{|w| = \pi/|h|} F(w)\{\tfrac{1}{2}(e^{hw} + 1)\}^\lambda e^{zw}\,dw, \tag{10}$$

where the branch of the power is one that is regular in the plane cut from $\pm i\pi/h$ to $\infty$.

Now consider (for the sake of simplicity) positive values of $h$, and apply (10) to $f(z) = J_0(\pi z/h)$. Then $F(w) = (w^2 + \pi^2/h^2)^{-1/2}$ for $|w| > \pi/h$, and satisfies the hypotheses required for (10) if $\lambda = \tfrac{1}{2}$. Thus

$$\nabla_h^{1/2} f(z) = (2\pi i)^{-1} \int_{|w| = \pi/h} \{w^2 + \pi^2/h^2\}^{-1/2}\{\tfrac{1}{2}(e^{hw} + 1)\}^{1/2} e^{zw}\,dw. \tag{11}$$

Here the first square root is defined in the plane cut from $i\pi/h$ to $-i\pi/h$. However, continuing either square root around $\pm i\pi/h$ replaces it by its negative, so the integrand in (11) is regular in the closed disk $|w| \leq \pi/h$. Hence the integral in (11) is zero. Thus $J_0(\pi z/h)$ is still another eigenfunction corresponding to $\lambda = \tfrac{1}{2}$.


REFERENCES 


1. R. P. Boas, Jr., Entire functions (New York, 1954). 

2. G. H. Hardy, Divergent Series (Oxford, 1949). 

3. I. M. Sheffer, Some applications of certain polynomial classes, Bull. Amer. Math. Soc., 47 
(1941), 885-898. 

4. D. B. Sumner, A generalized averaging operator, Can. J. Math., 8 (1956), 437-446. 


Northwestern University 
Evanston, Illinois 



















MIXED PROBLEMS FOR LINEAR SYSTEMS OF 
FIRST ORDER EQUATIONS 


G. F. D. DUFF 


Introduction. A mixed problem in the theory of partial differential 
equations is an auxiliary data problem wherein conditions are assigned on two 
distinct surfaces having an intersection of lower dimension. Such problems 
have usually been formulated in connection with hyperbolic differential 
equations, with initial and boundary conditions prescribed. In this paper a 
study is made of the conditions appropriate to a system of R linear partial 
differential equations of first order, in R dependent and N independent 
variables. That such a system can be used to study a single linear equation of 
higher order, with one dependent variable, will be demonstrated in a later 
paper. 

The method of analytical power series will be used, and applied to certain 
non-analytic problems by approximation procedures. However this study 
primarily reveals that a large class of analytic equations and systems can be 
treated in connection with mixed problems, and that the behaviour of solutions 
near the intersection of the two surfaces can be determined. 

Mixed problems for normal hyperbolic linear equations of the second order 
have been treated by Krzyzanski and Schauder (8), by Ladyzhenskaya (9), 
and in (3; 10; 11; 12). Systems of first order equations, restricted to the hyper- 
bolic type, have been studied in the case of two independent variables by 
Campbell and Robinson (2) who establish mixed boundary and initial con- 
ditions by a Picard iteration process. The analytic systems treated here are not 
necessarily of hyperbolic type, although the existence of at least some charac- 
teristic surfaces is assumed. 

The solution of a mixed problem for a hyperbolic equation or system is 
defined on a domain which is split up into two or more portions by certain 
characteristic surfaces. A reduction to standard form of such problems may be 
achieved by subtracting out a solution of a pure initial value problem with the 
given initial data. As the solution of the latter problem can be regarded as 
known (10) we shall employ this device. Thus the mixed problem is reduced 
to a problem wherein some of the data are given on a characteristic surface. 

This fact has significance in a different connection. The basic existence 
theorem for analytic partial differential equations, the Cauchy-Kowalewsky 
theorem, contains the requirement that the system treated should be written 
in normal form (4). This amounts to the condition that the datum surface be 
non-characteristic. Thus the above reduction of a mixed problem leads to an 


Received March 18, 1957. 


exceptional case of the Cauchy-Kowalewsky theorem. It is from this stand- 
point that we shall treat mixed problems involving a single characteristic 
surface, and consequently the theorem also constitutes a supplement to the 
Cauchy-Kowalewsky theorem. We remark that only for linear equations is a 
characteristic surface defined independently of solutions of the equations. For 
non-linear problems the data determine the characteristic surfaces, and a more 
complex situation arises. 

Here is an outline of the detailed results. We first consider a characteristic 
surface of multiplicity $\mu$ and show that conditions given on a second surface
are appropriate for an analytic solution. A certain algebraic condition appears 
in this result, and in the second part we study the case in which this condition 
fails and non-simple elementary divisors appear in certain coefficient matrices. 
It is shown that this case is not appropriate to a mixed problem. 

The general problem in the analytic case, of an arbitrary number of charac- 
teristic surfaces, is then treated, with the assumption that each characteristic 
surface has simple elementary divisors. Series expansions for the solution 
functions are found in each of the several regions defined by the characteristic 
surfaces. Finally an extension of these results to the non-analytic case is made 
for symmetric hyperbolic systems, for which estimates of the Friedrichs-Lewy 
type are known (6; 7). 


1. The linear system. Consider the system of $R$ linear partial differential equations of first order

$$a^i_{rs}\,\frac{\partial u_s}{\partial x^i} + b_{rs}u_s = f_r \qquad (r, s = 1,\dots,R;\ i = 1,\dots,N) \tag{1.1}$$

in $R$ dependent variables $u_r$ and $N$ independent variables $x^i$. Here summation over the repeated indices $s$ and $i$ is understood. Since any linear system of partial differential equations can be reduced to a system of first order equations by taking suitable partial derivatives as new variables, (1.1) has considerable generality. We assume for the present that all coefficients, functions and solutions are real analytic functions of the variables $x^i$.

With matrix notation $A^i = (a^i_{rs})$ for the coefficients and vector notation $u$ for the unknowns we can write (1.1) as

$$Eu \equiv A^i\,\frac{\partial u}{\partial x^i} + Bu = f. \tag{1.2}$$


We note that the array $a^i_{rs}$ of the leading coefficients has two matrix indices $r$ and $s$ which will transform affinely under linear transformations of the dependent variables $u_r$, and one coordinate index $i$ which can be taken to be contravariant under functional transformations of the coordinates $x^i$.

To apply the theorem of Cauchy and Kowalewsky to the system we choose a surface $S$: $\phi(x^i) = 0$ such that (1.1) can be written in normal form relative to $S$ (4, p. 56). Thus if $\phi = \phi(x^i)$, the normal form is

$$\frac{\partial u}{\partial \phi} = E_1 u + f_1, \tag{1.3}$$

where derivatives with respect to $\phi$ appear on the left only so that $E_1$ contains only differentiations with respect to $N - 1$ other coordinates. The theorem then asserts the existence of a unique analytic solution of (1.1) which assumes given analytic values on $S$. Thus $R$ initial conditions are assigned. The condition of solvability for the transverse derivatives is readily computed and is found to be that the determinant

$$\left| a^i_{rs}\,\frac{\partial \phi}{\partial x^i} \right| \tag{1.4}$$

should not vanish (14, p. 30). Here a contraction over $i$ is understood.
Let us consider the case when the determinant (1.4) does vanish on a surface 
G: o(x*) = 0. Then the surface is, by definition, characteristic. If the matrix 


(1.5)    $$A^i \frac{\partial \phi}{\partial x^i}$$

has rank $R - \mu$, then $\mu$ will be called the multiplicity of $G$ as a characteristic
surface (1, p. 268). If $G$ has multiplicity $\mu$, it is possible to solve (1.1) for $R - \mu$
only of the derivatives $\partial u_r/\partial t$. We thus find $R - \mu$ equations of the form

(1.6)    $$\frac{\partial u_j}{\partial t} = E_j(u_k), \qquad j = 1, \dots, R - \mu,$$

together with $\mu$ further equations

(1.7)    $$L_j(u_k) = 0, \qquad j = R - \mu + 1, \dots, R,$$

which contain no derivatives with respect to $t$. These latter are "inner"
relations on $G$ as they involve only the values and tangential derivatives of
the $u_r$ on $G$. Therefore they constitute necessary conditions for any set of
values of the $u_r$ on $G$. Thus it is to be expected that $R - \mu$ suitably chosen
components $u_r$ will determine on $G$ the values of the remaining components,
and so $R - \mu$ "initial" conditions are appropriate for $G$.
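The definition of multiplicity can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the 2 x 2 wave-type system $\partial u_1/\partial t = \partial u_2/\partial x$, $\partial u_2/\partial t = \partial u_1/\partial x$ is a hypothetical example, not taken from the text, and the helper names `rank`, `characteristic_matrix` and `multiplicity` are invented here. It forms the matrix (1.5) for a given gradient of $\phi$ and reports $\mu = R - \text{rank}$.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a small matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank_found = 0
    for col in range(n_cols):
        # Look for a usable pivot in this column among the remaining rows.
        pivot = next(
            (i for i in range(rank_found, len(rows)) if rows[i][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank_found], rows[pivot] = rows[pivot], rows[rank_found]
        for i in range(rank_found + 1, len(rows)):
            factor = rows[i][col] / rows[rank_found][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank_found])]
        rank_found += 1
    return rank_found


def characteristic_matrix(coeffs, grad_phi):
    """Form A^i * (d phi / d x^i), summed over i, from a list of R x R matrices."""
    size = len(coeffs[0])
    return [
        [sum(A[r][s] * p for A, p in zip(coeffs, grad_phi)) for s in range(size)]
        for r in range(size)
    ]


def multiplicity(coeffs, grad_phi):
    """mu = R - rank of the matrix (1.5); mu = 0 means non-characteristic."""
    size = len(coeffs[0])
    return size - rank(characteristic_matrix(coeffs, grad_phi))


# Hypothetical example: coordinates ordered (x, t), system written as
# A_X du/dx + A_T du/dt = 0 with the matrices below.
A_X = [[0, -1], [-1, 0]]
A_T = [[1, 0], [0, 1]]
```

With these assumed matrices, `multiplicity([A_X, A_T], [1, 1])` treats the surface $t + x = \text{const}$ and `multiplicity([A_X, A_T], [0, 1])` treats $t = \text{const}$; the first is characteristic and the second is not.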

However, since (1.7) are differential equations on $G$, an equal number of
initial conditions for them will be needed to determine uniquely all the initial
values. We shall assign $\mu$ further conditions, subject to certain restrictions
which will be stated below, on a second surface $T$: $\psi(x^i) = 0$ which we now
introduce. Let $T$ be not characteristic, and let $G$ and $T$ intersect in an edge
$C$ of $N - 2$ dimensions, which also will be analytic. We remark that if $C$ is
given then $G$ may be determined as a characteristic surface passing through $C$,
and composed of the characteristic curves of the characteristic equation

(1.8)    $$\left| A^i \frac{\partial \phi}{\partial x^i} \right| = 0,$$

according to the theory of a single partial differential equation of the first
order. These curves are the bicharacteristics of our system. Indeed we may











130 G. F. D. DUFF 


suppose that $\phi(x^i)$ is constructed as a solution of (1.8), so that the family of
surfaces $\phi(x^i) = \text{const.}$ are all characteristic.


2. Reduction to canonical form relative to a characteristic surface. In
order to construct the solution described above, and to specify in detail the
necessary conditions, we reduce the system (1.1) to an appropriate standard
form relative to this problem. Let $G$ and $T$ have equations $\phi(x^i) = 0$, $\psi(x^i) = 0$
respectively, and set

(2.1)    $$t = x'^{N} = \phi(x^i), \qquad x = x'^{N-1} = \psi(x^i)$$

in a suitable new coordinate system $x'^i$. Dropping the primes we now let
Greek indices $\rho, \sigma$ run from 1 to $N - 2$ over the remaining coordinates
$x^1, \dots, x^{N-2}$.

Since $\phi(x^i) = x^N$, the matrix

$$(a^N_{rs}) = \left( a^i_{rs} \frac{\partial \phi}{\partial x^i} \right)$$

now has rank $R - \mu$. Let $y_r^{(\alpha)}$ and $z_s^{(\alpha)}$ denote the $\mu$ linearly independent left
hand and right hand null vectors of this matrix. Thus

(2.2)    $$y_r^{(\alpha)} a^N_{rs} = 0, \qquad a^N_{rs} z_s^{(\alpha)} = 0, \qquad \alpha = 1, \dots, \mu.$$


Multiplying (1.1) on the left by $y_r^{(\alpha)}$ we find

$$0 = y_r^{(\alpha)} a^N_{rs} \frac{\partial u_s}{\partial x^N} = - y_r^{(\alpha)} \sum_{i<N} a^i_{rs} \frac{\partial u_s}{\partial x^i} + \dots,$$

which can be rewritten in the form

(2.3)    $$y_r^{(\alpha)} a^{N-1}_{rs} \frac{\partial u_s}{\partial x} = - y_r^{(\alpha)} \sum_{\rho} a^{\rho}_{rs} \frac{\partial u_s}{\partial x^{\rho}} + \dots.$$

These $\mu$ equations are independent of derivatives with respect to $t$.
There are now $R - \mu$ equations containing derivatives with respect to $t$;
we may write these in the form

$$\frac{\partial}{\partial t}\left( a^N_{rs} u_s \right) = - \sum_{i<N} a^i_{rs} \frac{\partial u_s}{\partial x^i} + \dots,$$

where $r$ varies from 1 to $R$. However as $a^N_{rs}$ has rank $R - \mu$, only that number
of linearly independent combinations of the form

(2.4)    $$v_r = a^N_{rs} u_s$$

are generated. Let us number these independent combinations from 1 to $R - \mu$,
and write the above equations as

(2.5)    $$\frac{\partial v_r}{\partial t} = a_{rs} \frac{\partial u_s}{\partial x} + L_r(u_s), \qquad r = 1, \dots, R - \mu.$$

The combinations $v_r$ shall be called normal variables with respect to $G$. Here







all the dependent variables are still denoted by $u_s$, a further $\mu$ linear
combinations of the $u_s$ remaining to be specified. $L_r(u_s)$ is a linear first order
differential operator in the variables $x^{\rho}$ ($\rho = 1, \dots, N - 2$).

If we now define

(2.6)    $$w_{R-\alpha+1} = y_r^{(\alpha)} a^{N-1}_{rs} u_s, \qquad \alpha = 1, \dots, \mu,$$

we can write (2.3) in the form

(2.7)    $$\frac{\partial w_r}{\partial x} = L_r(u_s), \qquad r = R - \mu + 1, \dots, R.$$


The linear combinations (2.6) shall be called null with respect to $G$, or more
briefly, null. However we must show that the null quantities (2.6) are linearly
independent of each other, and of the variables in (2.4), so that the combined
new system (2.5) with (2.7) is equivalent to (1.1), being obtained from (1.1)
by a non-singular linear transformation of the $u_s$ into the $v_r$ and $w_r$, together
with linear combinations of a non-singular nature of the member equations of
the system.

In the first place the combinations $w_r$ ($r = R - \mu + 1, \dots, R$) of (2.6)
are linearly independent. For otherwise, we should infer from a dependence

$$\sum_{\alpha} c_{\alpha} w_{R-\alpha+1} = \sum_{\alpha} c_{\alpha} y_r^{(\alpha)} a^{N-1}_{rs} u_s = 0$$

for all $u_s$, the relations

$$\sum_{\alpha} c_{\alpha} y_r^{(\alpha)} a^{N-1}_{rs} = 0.$$

However, since $T$ was assumed non-characteristic, the matrix $(a^{N-1}_{rs})$
is non-singular. It follows that

$$\sum_{\alpha} c_{\alpha} y_r^{(\alpha)} = 0,$$

which implies $c_{\alpha} = 0$ as the null vectors $y_r^{(\alpha)}$ are linearly independent.
We now ascertain the condition that the combined set (2.4) and (2.6) should
be linearly independent. A linear relation of dependence takes the form

$$\sum_{r=1}^{R-\mu} c_r a^N_{rs} u_s + \sum_{\alpha=1}^{\mu} c_{R-\alpha+1} y_r^{(\alpha)} a^{N-1}_{rs} u_s = 0,$$

and if this holds for all $u_s$ we have

(2.8)    $$\sum_{r=1}^{R-\mu} c_r a^N_{rs} + \sum_{\alpha=1}^{\mu} c_{R-\alpha+1} y_r^{(\alpha)} a^{N-1}_{rs} = 0.$$

If a non-vanishing set of constants satisfies these equations, not all of either
group may be zero. This has just been shown for the first group. For the second
group, we refer to the definition (2.4) of the first group of the $v_r$ and note that
by a relabelling of columns we may assume that the $(R - \mu) \times (R - \mu)$













determinant $|a^N_{rs}|$ ($r, s = 1, \dots, R - \mu$) is not zero. Thus if the second
term in (2.8) is assigned, the $c_r$ in the first sum are determined by the $R - \mu$
equations with $s = 1, \dots, R - \mu$.

Now the full determinant of (2.8) is

(2.9)    $$\begin{vmatrix} a^N_{11} & \cdots & a^N_{R-\mu,1} & y_m^{(1)} a^{N-1}_{m1} & \cdots & y_m^{(\mu)} a^{N-1}_{m1} \\ \vdots & & \vdots & \vdots & & \vdots \\ a^N_{1R} & \cdots & a^N_{R-\mu,R} & y_m^{(1)} a^{N-1}_{mR} & \cdots & y_m^{(\mu)} a^{N-1}_{mR} \end{vmatrix}.$$

Since the $\mu$ right null vectors $z_s^{(\alpha)}$ of $(a^N_{rs})$ are independent, we can multiply
the $s$th row by $z_s^{(\alpha)}$ and subtract these multiples from the bottom $\mu$ rows in
such a way as to make the $(R - \mu) \times \mu$ block in the lower left corner vanish.
This can be effected by a choice of basis for the $z_s^{(\alpha)}$ such that

$$z^{(\alpha)}_{R-\mu+\beta} = \delta_{\alpha\beta} \qquad (\alpha, \beta = 1, \dots, \mu).$$

Such a choice of basis is possible since the $z_s^{(\alpha)}$ are independent and since
$z_s^{(\alpha)} = 0$, $s > R - \mu$ implies

$$\sum_{s=1}^{R-\mu} a^N_{rs} z_s^{(\alpha)} = 0, \qquad r = 1, \dots, R - \mu,$$

which in turn implies $z_s^{(\alpha)} = 0$ as the determinant of this set of equations has
been chosen different from zero. Thus if we multiply the $s$th row by $z_s^{(\beta)}$ and
add these multiples to the $(R - \mu + \beta)$th row, the lower left hand $(R - \mu) \times \mu$
block will vanish. The determinant thus becomes the product

(2.10)    $$|a^N_{rs}| \cdot |y_m^{(\alpha)} a^{N-1}_{mn} z_n^{(\beta)}|,$$


where $r, s = 1, \dots, R - \mu$; $m, n = 1, \dots, R$, and $\alpha, \beta = 1, \dots, \mu$. The first
factor is not zero. The null vectors $y_m^{(\alpha)}$ and $z_n^{(\beta)}$ depend on the indices $m$ and
$n$ in such a way that the combination in the second determinant is invariant
under linear transformations of the $u$'s. Since

$$a^{N-1}_{mn} = a^i_{mn} \frac{\partial \psi}{\partial x^i}$$

in a general coordinate system, we see that the general invariant condition for
the non-vanishing of (2.9) is that the determinant

(2.11)    $$\left| y_m^{(\alpha)} a^i_{mn} \frac{\partial \psi}{\partial x^i} z_n^{(\beta)} \right| \neq 0.$$

Here $\alpha, \beta = 1, \dots, \mu$; $i = 1, \dots, N$, and $m, n = 1, \dots, R$.

We now show that this condition is satisfied in the case when $G$ has
multiplicity one, provided that the edge $C = G \cap T$ is nowhere tangent to the
bicharacteristic direction on $G$. If we define the contravariant vector

(2.11)    $$h^i = y_m a^i_{mn} z_n \qquad (\alpha = \beta = 1),$$

then (2.10) implies

(2.12)    $$h^i \frac{\partial \psi}{\partial x^i} \neq 0.$$










The result will be established if we can show that $h^i$ is parallel to the
bicharacteristic direction defined by

(2.13)    $$\frac{dx^i}{d\tau} = F_{p_i}, \qquad F = |a^i_{rs} p_i|.$$

We assume that not all of the $F_{p_i}$ vanish, so that the bicharacteristic direction
is well defined. On $G$ we have $p_i = 0$, $i < N$, and $p_N = 1$ since the equation of
$G$ is $t = x^N = 0$.
Now

(2.14)    $$F_{p_i} = \sum_{r,s} a^i_{rs} M_{rs}(p),$$

where $M_{rs}(p)$ is the cofactor of the determinant $F$ with respect to the $r, s$
position. With $p_i = \delta_{iN}$ we find

(2.15)    $$F_{p_i} = \sum_{r,s} a^i_{rs} m_{rs},$$

where now $m_{rs}$ is the cofactor in $|a^N_{rs}|$. Since

(2.16)    $$\sum_{t} a^N_{tr} m_{ts} = \delta_{rs} |a^N_{rs}| = 0,$$

we see that for every $s$, $m_{r(s)}$ is a null vector on the left for the matrix $a^N_{rs}$ of
rank $R - 1$. Thus $m_{r(s)} = y_{r(s)}$ in our previous notation for the null vectors,
where, however, the bracketed index is inactive. Similarly, for any $r$, $m_{(r)s}$ is
a null vector $z_{(r)s}$ on the right. Since $a^N_{rs}$ has rank $R - 1$ the minors $m_{rs}$
are not all zero; thus suppose $m_{ab} \neq 0$. Since there is only one independent
null vector, we have

$$m_{rs} = c_s m_{rb},$$

where $c_s$ is independent of $r$. Setting $r = a$, we find

$$c_s = \frac{m_{as}}{m_{ab}},$$

whence

$$m_{rs} = \frac{m_{as} m_{rb}}{m_{ab}}.$$

Thus

(2.17)    $$F_{p_i} = \sum_{r,s} a^i_{rs} \frac{m_{as} m_{rb}}{m_{ab}} = \frac{1}{m_{ab}} \sum_{r,s} m_{rb} a^i_{rs} m_{as},$$

and since $m_{rb} = k_b y_r$, $m_{as} = k_a z_s$, where $k_b$, $k_a$ depend only on the indicated
suffixes, and are each different from zero since $m_{ab} \neq 0$, we find

(2.18)    $$F_{p_i} = \frac{k_a k_b}{m_{ab}} \sum_{r,s} y_r a^i_{rs} z_s = \frac{k_a k_b}{m_{ab}} h^i.$$

This proves that the vector $h^i$ is parallel to the direction of the bicharacteristic
displacement, as required.













An example of a case where this condition (2.12) holds for a simple
characteristic surface ($\mu = 1$) is when the system (1.1) is symmetric, so that
$a^i_{rs} = a^i_{sr}$, and when $T$ is spacelike, which means, in effect, that $a^{N-1}_{rs}$ is
positive definite (6). The condition (2.12) is satisfied since $z_r = y_r$ and

(2.19)    $$h^i \frac{\partial \psi}{\partial x^i} = y_r a^i_{rs} \frac{\partial \psi}{\partial x^i} y_s = a^{N-1}_{rs} y_r y_s > 0.$$
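The symmetric case can be illustrated on a small system. The following sketch is an assumption-laden example rather than anything from the text: it reuses a hypothetical 2 x 2 symmetric wave-type system in ordinary coordinates $(x, t)$, takes $G$: $t + x = 0$ and $T$: $t = 0$, and uses the invented helpers `contract`, `h_vector` and `dot`. It evaluates $h^i = y_r a^i_{rs} y_s$ and checks that $h^i$ is tangent to $G$ while $h^i \, \partial\psi/\partial x^i > 0$, as in (2.19).

```python
def contract(y, A, z):
    """Bilinear form y_r A_rs z_s for small dense matrices."""
    return sum(y[r] * A[r][s] * z[s] for r in range(len(y)) for s in range(len(z)))


def h_vector(y, coeffs, z):
    """Components h^i = y_r a^i_rs z_s, one per coordinate direction."""
    return [contract(y, A, z) for A in coeffs]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical symmetric system A_X du/dx + A_T du/dt = 0, coordinates (x, t).
A_X = [[0, -1], [-1, 0]]
A_T = [[1, 0], [0, 1]]
COEFFS = [A_X, A_T]

GRAD_PHI = [1, 1]  # G: t + x = 0, a characteristic line of this system
GRAD_PSI = [0, 1]  # T: t = 0; A^i psi_i = A_T is positive definite (spacelike)

# Null vector of A^i phi_i = [[1, -1], [-1, 1]]; by symmetry z = y.
Y = [1, 1]

H = h_vector(Y, COEFFS, Y)
```

Here `dot(H, GRAD_PHI)` vanishes, so $h^i$ lies in $G$ (it points along the bicharacteristic), and `dot(H, GRAD_PSI)` is the positive quantity of (2.19).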

The auxiliary conditions to be applied on $G$ and $T$ will now be formulated,
and will be referred to as initial and boundary conditions, respectively. For
initial conditions (on $G$) we assign $R - \mu$ linear and independent combinations
of the variables $u_s$ in the form

(2.20)    $$c_{\lambda s} u_s = g_{\lambda}, \qquad \lambda = 1, \dots, R - \mu.$$

An algebraic restriction necessary for our theorem is that it should be possible
to solve these conditions for the normal group (2.4) of transformed variables
$v_r$ ($r = 1, \dots, R - \mu$). From (2.4) and the non-vanishing of $|a^N_{rs}|$
($r, s = 1, \dots, R - \mu$), we have

$$u_s = A^N_{sr} v_r + \dots,$$

where $A^N_{sr}$ denotes an inverse matrix; and we shall therefore be able to solve
(2.20) as required if on $G$ the determinant

(2.21)    $$|c_{\lambda s} A^N_{sr}| \neq 0, \qquad \lambda, r, s = 1, \dots, R - \mu,$$

which we now assume.

The form of the $\mu$ additional boundary conditions to be imposed on $T$ is also
linear:

(2.22)    $$c_{\nu s} u_s = g_{\nu}, \qquad \nu = R - \mu + 1, \dots, R.$$

We require that these equations be solvable for the null variables $w_r$ of (2.7).
To determine the condition necessary for this, we note that if (2.22) are not
thus solvable, there will exist a linear dependence among the $R - \mu$ unknowns
of the group (2.4) and the left side of (2.22) of the form

(2.23)    $$\sum_{r=1}^{R-\mu} a_r a^N_{rs} u_s + \sum_{\nu=R-\mu+1}^{R} c_{\nu} c_{\nu s} u_s = 0.$$

Since the $u_s$ are arbitrary, we find

(2.24)    $$\sum_{r=1}^{R-\mu} a_r a^N_{rs} + \sum_{\nu=R-\mu+1}^{R} c_{\nu} c_{\nu s} = 0.$$

We shall require all solutions $(a_r, c_{\nu})$ of these linear homogeneous equations to
vanish, and thus assume that the determinant

(2.25)    $$\left| a^N_{rs}, \; c_{\nu s} \right| \neq 0, \qquad r = 1, \dots, R - \mu; \quad \nu = R - \mu + 1, \dots, R.$$

Here the suffix $s$ labels the $R$ rows of the array. Multiplying the rows by
components $z_s^{(\alpha)}$ of the right-hand null vectors as in (2.9), we can cause the lower









left hand block of $(R - \mu) \times \mu$ terms to disappear. The determinant splits into
two factors:

$$|a^N_{rs}| \cdot |c_{\nu s} z_s^{(\alpha)}|, \qquad r, s = 1, \dots, R - \mu; \quad \alpha = 1, \dots, \mu; \quad \nu = R - \mu + 1, \dots, R,$$

of which the first is not zero. The necessary condition is therefore the
non-vanishing of the determinant of order $\mu$:

(2.26)    $$|c_{\nu s} z_s^{(\alpha)}| \neq 0.$$

Here $s$ is summed from 1 to $R$ in each element, and this condition will apply
on the surface $T$.

On the edge of intersection $C$ both of these sets of conditions should apply.
We assume that taken together (2.20) and (2.22) shall determine the values of
all $R$ dependent variables uniquely on $C$. The compatibility as well as the
uniqueness of such values will be assured if we suppose that the $R \times R$
determinant

(2.27)    $$|c_{\lambda s}| \neq 0, \qquad \lambda, s = 1, \dots, R.$$


3. Construction of the solution. The preceding calculations lead to a
standard form for the differential system, and for the auxiliary conditions,
which we shall now employ. The differential equations are a group in normal
form

(3.1)    $$\frac{\partial v_r}{\partial t} = a_{rs} \frac{\partial v_s}{\partial x} + L_r(v_s, w_s) + f_r, \qquad r = 1, \dots, R - \mu,$$

and a group in which derivatives with respect to $x$ of the null variables appear:

(3.2)    $$\frac{\partial w_r}{\partial x} = L_r(v_s, w_s) + f_r, \qquad r = R - \mu + 1, \dots, R.$$

The operators $L_r(v_s, w_s)$ of the first order contain no differentiations with
respect to $t$ or to $x$. The initial conditions are given by values of the normal
group as linear combinations of the null group:

(3.3)    $$v_r = c_{r\lambda} w_{\lambda} + g_r, \qquad r = 1, \dots, R - \mu; \quad \lambda = R - \mu + 1, \dots, R.$$

These hold for $t = 0$. The boundary conditions on $T$ are of the opposite type:

(3.4)    $$w_{\lambda} = c_{\lambda r} v_r + g_{\lambda}, \qquad \lambda = R - \mu + 1, \dots, R; \quad r = 1, \dots, R - \mu,$$

and this seems to prevent the ordering used by Riquier (16).

Now (3.2), (3.3) and (3.4) enable us to determine initial values for all of the
$w_r$ on $G$. Consider the null group of differential equations on the $N - 1$
dimensional surface $G$ and note that relative to the edge $C$ and the set $w_r$ of unknowns
they are in Cauchy-Kowalewsky normal form. The $v_s$ are to be replaced by
their values (3.3) on the right side of (3.2), so that a self-contained system
for the $w_r$ is established. On the edge $C$ the initial values for the $w_r$ are assigned,
and it follows from the Cauchy-Kowalewsky theorem that a unique analytic













system of values for the $w_r$ on $G$ exists. This process determines initial values
on $G$ for all of the variables.
Let the unknowns now be expanded in a series of powers of $t$: thus

(3.5)    $$v_r = \sum_{n=0}^{\infty} v_{r(n)}(x, x^{\rho}) \, t^n, \qquad w_r = \sum_{n=0}^{\infty} w_{r(n)}(x, x^{\rho}) \, t^n.$$

The coefficients, functions, and operators appearing in the differential equations
or auxiliary conditions shall also be expanded in power series of $t$, the coefficient
of $t^n$ of such a function $f_r$ being denoted by $f_{r(n)}$. We have to show that the
coefficients in (3.5) can be determined recursively.

Suppose known all coefficients of index less than $n + 1$, and let us calculate
the coefficients of index $n + 1$. To do this, let us expand (3.1), (3.2) and (3.4) in
powers of $t$ and equate coefficients of equal powers on both sides. It is found that

(3.6)    $$(n + 1) v_{r(n+1)} = \sum_{\nu=0}^{n} a_{rs(n-\nu)} \frac{\partial v_{s(\nu)}}{\partial x} + \sum_{\nu=0}^{n} L_{r(n-\nu)}(v_{s(\nu)}, w_{s(\nu)}) + f_{r(n)}, \qquad r = 1, \dots, R - \mu,$$

(3.7)    $$\frac{\partial w_{r(n+1)}}{\partial x} = L'_{r(0)}(v_{s(n+1)}) + L''_{r(0)}(w_{s(n+1)}) + \sum_{\nu=0}^{n} L_{r(n+1-\nu)}(v_{s(\nu)}, w_{s(\nu)}) + f_{r(n+1)}, \qquad r = R - \mu + 1, \dots, R,$$

and

(3.8)    $$w_{r(n+1)} = \sum_{\nu=0}^{n+1} c_{rs(n+1-\nu)} v_{s(\nu)} + g_{r(n+1)}, \qquad x = 0.$$

Here $L'$ and $L''$ denote the terms in $L$ which contain the $v_s$ and the $w_s$
respectively.

Assuming known all coefficients of index not exceeding $n$, we can calculate
from (3.6) the values of the $v_{r(n+1)}$ ($r = 1, \dots, R - \mu$). Then all terms on the
right of (3.7) except the $L''$ term are known, and (3.7) can be regarded
as a system of differential equations for the determination of the $w_{r(n+1)}$.
Together with (3.8) as initial conditions, these equations are in normal form
relative to the variety $x = 0$. Thus, by the Cauchy-Kowalewsky theorem, we
conclude that the $w_{r(n+1)}$ exist and are uniquely determined by a power series
in $x$ and $x^{\rho}$, convergent for sufficiently small values of these variables. This
completes the step of the recursive construction, and shows that the series
(3.5) can be formed term by term.
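The mechanism of the step (3.6), dividing by $n + 1$ after applying the right-hand operator to the previous coefficient, can be seen on a scalar toy problem. The sketch below is only an illustration and is not the system of the text: it assumes a single hypothetical equation $\partial u/\partial t = a \, \partial u/\partial x + b u$ with constant $a$, $b$ and polynomial initial data in $x$, and the names `derivative` and `series_in_t` are invented here.

```python
from fractions import Fraction


def derivative(poly):
    """d/dx of a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(poly)][1:] or [Fraction(0)]


def series_in_t(initial, a, b, order):
    """Coefficients u_(0), ..., u_(order) of u = sum u_(n)(x) t^n.

    Toy recursion (n + 1) u_(n+1) = a * d/dx u_(n) + b * u_(n), the scalar,
    constant-coefficient analogue of a step of type (3.6).
    """
    coeffs = [[Fraction(c) for c in initial]]
    for n in range(order):
        current = coeffs[-1]
        slope = derivative(current)
        # Pad the derivative so that both lists have the same length.
        slope = slope + [Fraction(0)] * (len(current) - len(slope))
        coeffs.append([(a * d + b * c) / (n + 1) for d, c in zip(slope, current)])
    return coeffs
```

For instance, with initial data $x^2$, $a = 1$ and $b = 0$, `series_in_t([0, 0, 1], 1, 0, 3)` reproduces the expansion of $(x + t)^2 = x^2 + 2xt + t^2$, and with initial data 1, $a = 0$, $b = 1$ it yields the Taylor coefficients of $e^t$.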


4. Convergence of the series expansion. To complete this existence
theorem it is necessary to show that the formal power series (3.5) converges
for some interval of values of $t$. For this purpose we shall consider that the
$v_{r(n)}(x, x^{\rho})$ have been expanded in a multiple power series and that this has
been substituted in (3.5). If a non-trivial domain of convergence for the
resulting multiple series is established, this will show, by absolute convergence,










the result desired. The convergence will be established by means of dominant
series constructed from a similar but dominant differential problem. From
(3.1), (3.2), (3.4) and the computations of the Cauchy-Kowalewsky solution
of (3.7), for each value of $n$, all of which involve only additions and
multiplications, it is clear that if we find dominant series for all coefficients
on the right sides of (3.1), (3.2), (3.4) and for the initial values $v_{r(0)}$, $w_{r(0)}$,
then the series solution so determined will dominate the original one. By
subtracting $v_{r(0)}$ or $w_{r(0)}$ from each of the unknowns we can assume that the
initial values are zero.

All of the normal unknowns $v_r$ of the first group shall be dominated by a
single function $V$, and all those of the null group $w_r$ by a second function $W$.
Let $y = x^1 + \dots + x^{N-2}$ and set

(4.1)    $$z = y + \frac{x}{a} + \frac{t}{a^2}, \qquad 0 < a < 1,$$

where $a$ is left undetermined for the present. The dominant system consists of
two differential equations, so chosen as to dominate the right sides of (3.1) and
(3.2) respectively. For a suitable choice of $M$, $M_1$, $F$, $F_1$ and $\rho$, these equations
can be written as

(4.2)    $$\frac{\partial V}{\partial t} = \frac{M}{1 - (x + y + t)/\rho} \left[ \frac{\partial V}{\partial x} + \frac{\partial V}{\partial y} + \frac{\partial W}{\partial x} + \frac{\partial W}{\partial y} + V + W + F \right]$$

and

(4.3)    $$\frac{\partial W}{\partial x} = \frac{M_1}{1 - (x + y + t)/\rho} \left[ \frac{\partial V}{\partial y} + \frac{\partial W}{\partial y} + V + W + F_1 \right].$$

Here a single dominant series is selected for all coefficients on the right side
of (3.1) and (3.2) respectively. A separate choice of constants $F$ and $F_1$ is
made for the non-homogeneous terms which, after the reduction of initial
values to zero, will depend upon the given initial data. If $V$ and $W$ dominate
the $v_r$ and $w_r$ respectively, then the above right hand sides will dominate the
corresponding members of (3.1) and (3.2).

Dominant forms of the auxiliary conditions are

(4.4)    $$V \geq 0, \qquad t = 0,$$

for the initial conditions, and

(4.5)    $$W \geq \frac{M_2}{1 - (y + t)/\rho} \, [V + F_2], \qquad x = 0,$$

for boundary conditions. It is the direction of the dominating relations which
is significant here.
Since we are free to increase any of the constants in the dominating series,
we may suppose $M_1$ so large that

(4.6)    $$M_1 > 2 M M_2.$$

We now assume, for the purpose of finding a convergent solution of this
system, that $V$ and $W$ are functions of the single variable $z$ only. By writing









$1 - z/(a^2\rho)$ in place of $1 - (x + y + t)/\rho$ in certain denominators we do not
decrease any coefficients in the series expansions, and so maintain the necessary
domination. Thus we find (derivatives with respect to $z$ being denoted by
primes)

(4.7)    $$\frac{1}{a^2} V' = \frac{M}{1 - z/(a^2\rho)} \left[ \left( \frac{1}{a} + 1 \right) V' + \left( \frac{1}{a} + 1 \right) W' + V + W + F \right]$$

and

(4.8)    $$\frac{1}{a} W' = \frac{M_1}{1 - z/(a^2\rho)} \, [V' + W' + V + W + F_1].$$

To this system we adjoin the boundary condition

(4.9)    $$W \geq \frac{M_2}{1 - z/(a^2\rho)} \, [V + F_2],$$

to replace (4.5). If (4.9) holds, and if also $V \geq 0$ (that is, $V$ has positive
coefficients) then the relation (4.5) will be maintained for $x = 0$.

Rearranging terms in (4.7) and (4.8), which are ordinary differential
equations with variable $z$, we find

(4.10)    $$\left[ \frac{1}{a} - \frac{z}{a^3\rho} - M(1 + a) \right] V' - (1 + a) M W' = M a \, [V + W + F]$$

and

(4.11)    $$\left[ \frac{1}{a} - \frac{z}{a^3\rho} - M_1 \right] W' - M_1 V' = M_3 \, [V + W + F_1].$$

On the right side of (4.11) we have introduced a new constant $M_3 > M_1$ to
replace $M_1$ in that position: this is a permissible alteration.

We have to show that the system has a convergent series solution with
positive coefficients such that (4.9) holds, and we will be able to assign initial
values $V(0)$ and $W(0)$ at will. The choice of $a$ is still open. Let us begin by
showing that the coefficients in any such series are positive, provided that
$V(0)$ and $W(0)$ are positive. Set

(4.12)    $$V = \sum_{n=0}^{\infty} v_n z^n, \qquad W = \sum_{n=0}^{\infty} w_n z^n.$$

Then (4.10) and (4.11) yield the recursion formulae

(4.13)    $$\left( \frac{1}{a} - M(1 + a) \right)(n + 1) v_{n+1} - M(1 + a)(n + 1) w_{n+1} = \left( \frac{n}{a^3\rho} + M a \right) v_n + M a \, w_n + M a F \delta_{0n},$$

and

(4.14)    $$- M_1 (n + 1) v_{n+1} + \left( \frac{1}{a} - M_1 \right)(n + 1) w_{n+1} = M_3 v_n + \left( \frac{n}{a^3\rho} + M_3 \right) w_n + M_3 F_1 \delta_{0n}.$$









Here $\delta_{0n}$ indicates the value one for $n = 0$, and zero otherwise. These
equations have the form, after division by $n + 1$,

(4.15)    $$A v_{n+1} + B w_{n+1} = F_n, \qquad C v_{n+1} + D w_{n+1} = G_n,$$

where $F_n \geq 0$, $G_n \geq 0$ if we suppose $v_n$, $w_n$ both positive. Now for sufficiently
small positive $a$,

(4.16)    $$A = \frac{1}{a} - M(1 + a) > 0, \quad B = -M(1 + a) < 0, \quad C = -M_1 < 0, \quad D = \frac{1}{a} - M_1 > 0.$$

The solutions of (4.15) are given by

(4.17)    $$(AD - BC) v_{n+1} = F_n D - G_n B > 0, \qquad (AD - BC) w_{n+1} = G_n A - F_n C > 0.$$

Thus $v_{n+1}$ and $w_{n+1}$ will be positive provided that the determinant

$$AD - BC = \frac{1}{a^2} \left[ 1 - a(1 + a) M - a M_1 \right] > \frac{1}{a^2} \left[ 1 - a(2M + M_1) \right]$$

is positive. This condition, as well as (4.16), can be achieved if we set


(4.18) ms : 
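The sign argument behind (4.15) to (4.17) is Cramer's rule for a 2 x 2 system whose diagonal entries are positive and whose off-diagonal entries are negative. The following sketch is a numerical illustration only: the function names `coefficients`, `determinant` and `next_pair` are invented here, and the sample values of $a$, $M$, $M_1$ are arbitrary assumptions, not values from the text.

```python
from fractions import Fraction


def coefficients(a, M, M1):
    """Entries A, B, C, D as in (4.16), computed exactly."""
    a, M, M1 = Fraction(a), Fraction(M), Fraction(M1)
    A = 1 / a - M * (1 + a)
    B = -M * (1 + a)
    C = -M1
    D = 1 / a - M1
    return A, B, C, D


def determinant(a, M, M1):
    """AD - BC, which equals (1 / a^2) * (1 - a(1 + a)M - a M1)."""
    A, B, C, D = coefficients(a, M, M1)
    return A * D - B * C


def next_pair(a, M, M1, F_n, G_n):
    """Solve A v + B w = F_n, C v + D w = G_n by Cramer's rule, as in (4.17)."""
    A, B, C, D = coefficients(a, M, M1)
    det = A * D - B * C
    if det <= 0:
        raise ValueError("a is not small enough: AD - BC must be positive")
    v_next = (F_n * D - G_n * B) / det
    w_next = (G_n * A - F_n * C) / det
    return v_next, w_next
```

With the assumed values $a = 1/10$, $M = M_1 = 1$, the determinant is positive and non-negative right-hand sides $F_n$, $G_n$ give positive $v_{n+1}$, $w_{n+1}$, which is the induction step used for the positivity of the coefficients.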
Now the boundary condition (4.9) will hold if 


(4.19) (1 - = = W > M.[V + F;], 0 < pz <p, 
2 


and we will show that this relation follows from (4.11) provided only that 
(4.20) W(0) > M2V(O) + F,, 


a condition which is clearly necessary in any case. Dividing (4.11) by a certain 
constant, we have 


ia Zz —- —_ aM, yr 

aan (: a’ p(1 — ai) —_ 
= MV + F,|>0. 

Simplifying the coefficient of V’ by means of (4.18), we find 


raf _——— ee M, a. 
(4.22) (1 _ Vn OM V M,W > 0. 








Now the derivative of (4.19) is 


(4.23) (1 = +-)w" ~ M.v'-+-wyo, 
a& po a po 













and we wish to show that by proper adjustment of constants this will follow
from (4.22). We can choose for $\rho_2$ any value less than $\rho$, and the dominance
of the boundary condition will persist. Thus let us choose


+ l 
Pe: = p(l = aM;), M, > “jy. 
a pe 


Then, in view of (4.6), and the fact that $V$ and $W$ are series with positive
coefficients, (4.22) will imply (4.23). Assuming now that (4.23) holds, as well
as (4.20), we see that (4.19) will be valid. Multiplying each side of (4.19) by
the series with positive coefficients which is the reciprocal of $1 - z/(a^2\rho_2)$, we
find (4.9). This establishes the dominance of the boundary conditions.

The pair of linear ordinary first order differential equations (4.10) and (4.11)
have the origin $z = 0$ as an ordinary point in view of (4.18). Consequently
there exists a convergent series solution satisfying (4.20), for example with
$V(0) = 1$, $W(0) = 2M_2 + 2F_2$, and the radius of convergence of these series
is determined by the singular points of (4.10) and (4.11). Thus this radius
depends on $M$, $M_1$, $M_3$ and $\rho$, but not on $F$, $F_1$ or $F_2$. By the substitution (4.1),
dominant multiple series having a positive radius of convergence (independent
of the initial or boundary data) are found for the series solutions of the original
problem. This completes the proof that the latter series converge for sufficiently
small values of the coordinate variables.

The origin of coordinates can be chosen at will on the edge $C$ and by analytic
continuation a solution will exist in a region containing any given compact
portion of $C$. If uniform hypotheses regarding the coefficients of the original
problem are made, this local solution can be extended to large intervals of the
$x$ and $t$ coordinates by analytic continuation as is usual for analytic linear
differential equations.

To sum up, we have

THEOREM I. Let $G$: $\phi(x^i) = 0$ be a characteristic surface of multiplicity $\mu$
relative to the analytic linear system

(1.2)    $$A^i \frac{\partial u}{\partial x^i} + Bu = f,$$

of $R$ first order equations. Let $T$: $\psi(x^i) = 0$ intersect $G$ in an edge $C$ such that,
as in (2.11),

$$\left| y_m^{(\alpha)} a^i_{mn} \frac{\partial \psi}{\partial x^i} z_n^{(\beta)} \right| \neq 0.$$

Then there exists a unique analytic solution which satisfies $R - \mu$ initial
conditions of type (2.20), (2.21) on $G$ and $\mu$ boundary conditions of type (2.22), (2.26)
on $T$.


In order to bring this result into relation with a mixed problem, let us note
that if the non-homogeneous terms in (3.2) and (3.3) are zero on $G$, and if the
data $g_{\lambda}$ in (3.4) vanish on $C$, then the values $w_r(0)$ of all components of the













solution on $G$ will be zero. Now let Cauchy data (values of all components) be
assigned on a non-characteristic surface $S$, and let $T$ be a second
non-characteristic surface meeting $S$ in an edge $C$. Let $G$, a characteristic surface of
multiplicity $\mu$, pass through $C$ and divide into two regions $R_S$ and $R_T$ one of the four
regions defined by $S$ and $T$. Suppose that the necessary condition (2.11) is
satisfied, and let us determine a solution analytic in $R_S$ and in $R_T$, and continuous
across $G$, which takes the Cauchy values on $S$ and satisfies boundary conditions
of type (2.22), (2.26) on $T$. Subtracting away the Cauchy-Kowalewsky
solution of the initial value problem on $S$, we are left with a homogeneous system
and homogeneous auxiliary conditions on $G$. By the remark above, we can
define a solution analytic in $R_T$, and vanishing on $G$, provided that the
functions $g_{\lambda}$ in (3.4) vanish on $C$. Thus a piecewise analytic solution is found for the
mixed problem by adding (in $R_T$) this solution of the characteristic problem.
The restriction on the data $g_{\lambda}$ is a compatibility condition of the first order.
This construction includes as special cases the mixed problems for linear
second order equations treated in (3, 8), but we shall not state it as a separate
theorem since a more general problem is treated below.


5. Case of non-simple elementary divisors. If the basic condition (2.11),
which permits reduction of the differential equations to the standard forms
(3.1) and (3.2), is not satisfied, a somewhat different proof is required. It turns
out that the theorem still holds in very much the same form, but that the
solution on the characteristic surface is affected by the boundary data on the whole
surface $T$, not just the edge $C$. This has the consequence that the result is not
directly applicable to any mixed problem.

The earlier calculations and reductions have all been made essentially as if
the number of independent variables were two; this is an advantage of the
analytic case made possible by the generality of the Cauchy-Kowalewsky
theorem. We shall now employ the general standard form for a system of
first order equations in two variables, as presented by Petrowsky (14, p. 54)
for example, where the Jordan normal form of $A^N$ relative to $A^{N-1}$ is used
(17, p. 137). Let $G$: $\phi(x^i) = t = 0$ be a characteristic surface of multiplicity $\mu$
and let $T$: $\psi(x^i) = x = 0$ be a non-characteristic boundary meeting $G$ in the
edge $C$ as before. We select the two variables $t$ and $x$ and perform the reduction
to canonical form with respect to them. Since $T$ is non-characteristic, $A^{N-1}$ is
non-singular, and can be brought to unit matrix form. If then $A^N$ is reduced to
Jordan normal form by linear transformations of the $u_r$, the process is
completed. That $G$ is a characteristic surface of multiplicity $\mu$ signifies that $\mu$ of the
characteristic roots of the coefficient matrix $A^N$ relative to $A^{N-1}$ are zero. With
this simplification we can write the first $\mu$ of the equations in the form

(5.1)    $$\frac{\partial w_1}{\partial x} = L_1(v_s, w_s), \qquad \frac{\partial w_r}{\partial x} = \alpha_{r-1} \frac{\partial w_{r-1}}{\partial t} + L_r(v_s, w_s), \quad r = 2, 3, \dots, \mu.$$













Here the $v_s$ and $w_r$ denote appropriate linear combinations of the original
dependent variables, while the coefficients $\alpha_{r-1}$ and differential operators
$L_r$ depend on the coordinates $t$, $x$, and $x^{\rho}$ ($\rho = 1, \dots, N - 2$). The $L_r$ contain
only differentiations with respect to the $x^{\rho}$.

The second set of equations will contain derivatives with respect to $t$ and $x$
of the $v_s$ only. Since all other characteristic roots of $A^N$ differ from zero, we can
write these equations in a form solved for the derivatives with respect to $t$. Thus

(5.2)    $$\frac{\partial v_1}{\partial t} = \beta_1 \frac{\partial v_1}{\partial x} + L_{\mu+1}(v_s, w_s), \qquad \frac{\partial v_s}{\partial t} = \beta_s \frac{\partial v_s}{\partial x} + \gamma_{s-1} \frac{\partial v_{s-1}}{\partial x} + L_{\mu+s}(v_s, w_s).$$

The operators $L_{\mu+s}$ are again independent of $\partial/\partial x$ and $\partial/\partial t$.

We remark that the difficulties of this particular problem arise from the
presence of the coefficients $\alpha_{r-1}$ and $\gamma_{s-1}$, which appear in the canonical forms
of the original coefficient matrices because of certain non-simple elementary
divisors (17, p. 137). The variables $v_s$ in (5.2) may again be termed normal
with respect to the characteristic surface $G$. We shall again refer to the $w_r$ as
the null variables proper to the characteristic value $\lambda = 0$, that is, to the
characteristic surface $G$. Indeed, the reduction to canonical form shows that
these null variables are obtained from the $u_r$ by contraction with a suitable
characteristic, or proper, vector of the coefficient matrix $A^N$ (14, pp. 54-58).

For simplicity we assign auxiliary conditions of a more restricted type:
values of the $w_r$ ($r = 1, \dots, \mu$) on $T$ and values of the $v_s$ ($s = 1, \dots, R - \mu$)
on $G$.
Power series expansions in the two variables $t$ and $x$ are required: thus

(5.3)    $$v_s = \sum_{m,n} v_{s(m,n)} x^m t^n, \qquad w_r = \sum_{m,n} w_{r(m,n)} x^m t^n.$$

Inserting these series developments in (5.1) and (5.2) and equating coefficients
of like powers of $x$ and $t$, we find recursion formulae

(5.4)    $$(m + 1) w_{r(m+1,n)} = (n + 1) w_{r-1(m,n+1)} \alpha_{r-1(0,0)} + \dots + L_{r(0,0)}(v_{s(m,n)}, w_{s(m,n)}) + \dots$$

and

(5.5)    $$(n + 1) v_{s(m,n+1)} = (m + 1) v_{s(m+1,n)} \beta_{s(0,0)} + (m + 1) v_{s-1(m+1,n)} \gamma_{s-1(0,0)} + \dots + L_{\mu+s(0,0)}(v_{s(m,n)}, w_{s(m,n)}) + \dots.$$

Here all terms omitted contain coefficients of powers of $x$ less than $m + 1$
and powers of $t$ less than $n + 1$. Also we shall understand that $\alpha_0 = 0$ and
$\gamma_0 = 0$, so that the first equation of each set (5.1), (5.2) need not be written
separately.

The boundary conditions determine the coefficients $v_{s(m,0)}$ and $w_{r(0,n)}$.










To determine the coefficients recursively, let us suppose that all $v_{s(m,n)}$ and
$w_{r(m,n)}$ with $m + n < k$ are known, and let us determine those for which
$m + n = k$. From the boundary conditions $w_{r(0,k)}$ is known; from (5.4) we
may obtain in succession $w_{r(1,k-1)}, w_{r(2,k-2)}, \dots, w_{r(k,0)}$. Likewise,
we have $v_{s(k,0)}$ from the boundary conditions, and from (5.5) we find
successively $v_{s(k-1,1)}, v_{s(k-2,2)}, \dots, v_{s(0,k)}$, for $s = 1, \dots, R - \mu$ in this order at each
step. This completes the proof of induction on the recursive construction of
the coefficients.
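The order in which the coefficients of total degree $k$ are visited can be written out explicitly. The sketch below is only a bookkeeping illustration of the induction just described; the function name `recursion_order` and the tuple layout `(kind, m, n)` are assumptions introduced here, and no coefficient values are computed.

```python
def recursion_order(k):
    """Index pairs (m, n) with m + n = k in the order used by the induction.

    The first "w" entry (m = 0) and the first "v" entry (n = 0) are boundary
    data; the remaining entries are obtained in turn from (5.4) and (5.5).
    """
    # Null variables: start from the data on T (x = 0) and increase the power of x.
    null_steps = [("w", m, k - m) for m in range(k + 1)]
    # Normal variables: start from the data on G (t = 0) and increase the power of t.
    normal_steps = [("v", k - n, n) for n in range(k + 1)]
    return null_steps + normal_steps
```

For example, `recursion_order(2)` lists $w_{(0,2)}, w_{(1,1)}, w_{(2,0)}$ followed by $v_{(2,0)}, v_{(1,1)}, v_{(0,2)}$; within each entry the index $r$ or $s$ is taken in increasing order, since (5.4) and (5.5) couple an unknown to its predecessor in the Jordan chain.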

The formal power series so formed is clearly unique and to complete the
solution we must show that it converges. This will be done in the next section.
However we remark here that even if all non-homogeneous terms are zero
except the $w_{r(0,n)}$ for $n > 0$, the values of the $w_{r(m,0)}$, and so the values of the
$w_r$ on $G$, will not be zero in general. The coefficients of the $\alpha_{r-1}$ carry these
non-zero $w_{r(0,n)}$ through the steps of the recursion. Thus, values on $G$ are
affected by those on $T$ when non-simple divisors are present.


6. Convergence of the double series. The recursion formulae are so
arranged that if the right sides of the differential equations are dominated by
certain series, then the coefficients calculated from the corresponding recursion
formulae will be increased. By an easy preliminary transformation we can
reduce the auxiliary conditions to homogeneous ones, and we therefore assume
that the $v_{s(m,0)}$ and $w_{r(0,n)}$ of the dominating problem are zero.

Let the dependent variables of the dominant problem which we shall set up
be denoted by capitals $V_s$, $W_r$. It is necessary to distinguish the terms
containing each of the $W_r$ on the right side of each equation, and we denote by
$L_{rm}(W_m)$ an operator with coefficients majorizing those in (5.1) or (5.2), and
containing only first derivatives of $W_m$. Terms containing no derivatives will
be expressed by majorizing operators $M_r(V, W)$, while $L_r(V)$ shall denote a
similar expression containing first derivatives of the $V_s$ ($s = 1, \dots, R - \mu$)
with respect to the $x^{\rho}$ ($\rho = 1, 2, \dots, N - 2$). We denote by $A_{r-1}$, $B_s$ and $\Gamma_{s-1}$
functions whose series expansions dominate those of $\alpha_{r-1}$, $\beta_s$ and $\gamma_{s-1}$,
respectively. With these preliminaries, we can write the dominant equations in the
form

(6.1)    $$\begin{aligned} \frac{\partial W_1}{\partial x} &= \sum_m L_{1m}(W_m) + L_1(V) + M_1(V, W), \\ \frac{\partial W_r}{\partial x} &= A_{r-1} \frac{\partial W_{r-1}}{\partial t} + \sum_m L_{rm}(W_m) + L_r(V) + M_r(V, W), \\ \frac{\partial V_1}{\partial t} &= B_1 \frac{\partial V_1}{\partial x} + \sum_m L_{\mu+1,m}(W_m) + L_{\mu+1}(V) + M_{\mu+1}(V, W), \\ \frac{\partial V_s}{\partial t} &= B_s \frac{\partial V_s}{\partial x} + \Gamma_{s-1} \frac{\partial V_{s-1}}{\partial x} + \sum_m L_{\mu+s,m}(W_m) + L_{\mu+s}(V) + M_{\mu+s}(V, W). \end{aligned}$$













Solutions which are functions of a single combination $t/\alpha + x/\beta + \sum_\rho x^\rho$ of
the independent variables will be sought. However, in order to avoid the
vicious circle which results from application of the usual reduction technique
to this system, it is necessary to introduce changes of scale of the dependent
variables $W_r$ as well as of the independent variables $x$ and $x^\rho$. Let us denote
new variables, both dependent and independent, by a bar, and replace

$$W_r,\quad x,\quad \text{and}\quad x^\rho \tag{6.2}$$

by

$$a_r \bar W_r,\quad l\bar x,\quad \delta \bar x^\rho, \tag{6.3}$$

respectively, where $a_r$, $l$ and $\delta$ are $\mu + 2$ undetermined constants. In terms
of the new quantities we find that (6.1) becomes


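$$\frac{\partial \bar W_1}{\partial \bar x} = \frac{l}{\delta a_1}\Big[\sum_m a_m L_{1m}(\bar W_m) + L_1(V)\Big] + \frac{l}{a_1} M_1(V, W),$$

$$\frac{\partial \bar W_r}{\partial \bar x} = \frac{l a_{r-1}}{a_r} A_{r-1}\frac{\partial \bar W_{r-1}}{\partial t} + \frac{l}{\delta a_r}\Big[\sum_m a_m L_{rm}(\bar W_m) + L_r(V)\Big] + \frac{l}{a_r} M_r(V, W),$$

$$\frac{\partial V_1}{\partial t} = \frac{B_1}{l}\frac{\partial V_1}{\partial \bar x} + \frac{1}{\delta}\Big[\sum_m a_m L_{\mu+1,m}(\bar W_m) + L_{\mu+1}(V)\Big] + M_{\mu+1}(V, W), \tag{6.4}$$

$$\frac{\partial V_s}{\partial t} = \frac{B_s}{l}\frac{\partial V_s}{\partial \bar x} + \frac{\Gamma_{s-1}}{l}\frac{\partial V_{s-1}}{\partial \bar x} + \frac{1}{\delta}\Big[\sum_m a_m L_{\mu+s,m}(\bar W_m) + L_{\mu+s}(V)\Big] + M_{\mu+s}(V, W).$$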


Here the indices r, m range from 1 to u while s ranges from 1 to R — ug. 

The undetermined constants will now be chosen so that every term on the
right side of these equations which contains a derivative will have a factor, not
larger in magnitude than a given $\epsilon$, multiplying it. Thus the following combinations
must all be made no larger than $\epsilon$:

$$\frac{l}{a_1\delta},\quad \frac{l a_m}{a_1\delta},\quad \frac{l a_{r-1}}{a_r},\quad \frac{l a_m}{\delta a_r},\quad \frac{l}{\delta a_r},\quad \frac{1}{l},\quad \frac{a_m}{\delta},\quad \frac{1}{\delta}. \tag{6.5}$$

Of these all but the third and sixth contain the factor $1/\delta$. The sixth shows
that $l \geq 1/\epsilon$, and the third, that the $a_r$ must increase with $r$ as a geometric
series of ratio $1/\epsilon^2$. These conditions are all satisfied if we choose
1 1 
¢ 


€ 








(6.6) [= 
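As a numerical sanity check on the scaling argument, the following Python sketch evaluates combinations of the kind listed in (6.5) for one admissible choice of constants. The choice $l = 1/\epsilon$, $a_r = \epsilon^{-2r}$, $\delta = \epsilon^{-(2\mu+1)}$ and the function name `scale_combinations` are assumptions made for illustration only; they meet the stated requirements ($l \geq 1/\epsilon$, the $a_r$ growing geometrically with ratio $1/\epsilon^2$) but are not claimed to coincide with the choice (6.6).

```python
from fractions import Fraction

def scale_combinations(eps, mu):
    """Return the factors multiplying derivative terms after the change of scale.

    Hypothetical choice of constants (an assumption, not taken from the paper):
        l = 1/eps,  a_r = eps**(-2r),  delta = eps**(-(2*mu + 1)).
    """
    l = 1 / eps
    a = {r: eps ** (-2 * r) for r in range(1, mu + 1)}
    delta = eps ** (-(2 * mu + 1))
    combos = [1 / l, 1 / delta]
    for r in range(1, mu + 1):
        combos.append(l / (delta * a[r]))
        if r > 1:
            # the only combination besides 1/l that does not contain 1/delta
            combos.append(l * a[r - 1] / a[r])
        for m in range(1, mu + 1):
            combos.append(l * a[m] / (delta * a[r]))
            combos.append(a[m] / delta)
    return combos

eps = Fraction(1, 10)
worst = max(scale_combinations(eps, mu=3))
print(worst <= eps)  # every factor is bounded by eps for this choice
```

Exact rational arithmetic is used so that the comparison with $\epsilon$ is not blurred by rounding.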
Now let a dominant series

$$\frac{M}{1 - (t + x + y)/\rho}, \qquad M > 1, \quad y = x^1 + \dots + x^{N-2},$$

be chosen for the sum of all coefficients on the right side of (6.4), assuming that
the above special factors are not included in these coefficients. Then it is seen






















that the first group of (6.4) are in turn dominated by a single equation for new 
variables W and V; 


67) @____M Ee aw . av, W+V+ El. 








ax 1—(¢+2+9)/pl° a “Sa "fa 
and that the second group are likewise dominated by a corresponding equation 
for V: 
av _ M av, aw av W+V a4 

CS) “a “I- attranle — =" 2 
The factor e**! in the last term of this equation is inserted for convenience, on 
the assumption $\epsilon < 1$.

Let us show that these equations have a convergent series solution with
positive coefficients. Set $z = t + x + y$, and suppose $W$ and $V$ depend only
on $z$. If derivatives with respect to $z$ are indicated by a prime, we have, after
rearranging,





1— - - Dea)?" —«MV' = seal + V+ Fi, 


(1 — 2M) v’ — MW’ 
p 


respectively. Now let Y(z) be a solution of 


and 


iw + V+ Fi, 


(6.9) (1 — 3M — ‘) Yy’ = enl2Y + FI, 


and take W = V = Y. In order that W should have positive coefficients, we 
shall choose 


(6.10) oa a: 


Then 
432M*[2Y + F] 


dias 1 — 22/p 


an equation in which the right side, after expansion, has positive coefficients.
Thus if $W(0) = 1$, a solution with positive coefficients and radius of convergence
$\tfrac{1}{2}\rho$, independently of $F$, is secured.

Retracing the steps of this reduction, we conclude that (6.1) has a convergent 
solution set with positive coefficients, and hence that the formal expansions 
(5.3) converge. This completes the local existence proof for the problem 
formulated in §5. The domain can be extended as in other linear problems. 
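The positivity and radius-of-convergence claim for the majorant equation can be illustrated numerically. The Python sketch below takes a model equation of the same shape, $(1 - 2z/\rho)\,Y' = C\,(2Y + F)$ with $Y(0) = 1$, where the constant $C$ stands in for the numerical factor of the text and $F$ is taken to be a positive constant; both simplifications, and the function name `majorant_coefficients`, are assumptions made for illustration. Matching powers of $z$ gives the recursion $(n+1)\,y_{n+1} = (2n/\rho + 2C)\,y_n + CF\,\delta_{n0}$, so every coefficient is positive and the ratio of successive coefficients tends to $2/\rho$.

```python
from fractions import Fraction

def majorant_coefficients(C, F, rho, n_terms, y0=Fraction(1)):
    """Taylor coefficients y_n of Y solving (1 - 2z/rho) Y' = C (2Y + F), Y(0) = y0.

    C, F and rho are positive constants (F constant is a simplifying assumption).
    Recursion obtained by matching the coefficient of z**n on both sides:
        (n + 1) y_{n+1} = (2n/rho + 2C) y_n + C*F if n == 0 else (2n/rho + 2C) y_n
    """
    y = [Fraction(y0)]
    for n in range(n_terms - 1):
        forcing = C * F if n == 0 else 0
        y.append(((Fraction(2 * n) / rho + 2 * C) * y[n] + forcing) / (n + 1))
    return y

coeffs = majorant_coefficients(C=Fraction(1), F=Fraction(1), rho=Fraction(2), n_terms=300)
ratio = coeffs[-1] / coeffs[-2]
print(all(c > 0 for c in coeffs), float(ratio))  # ratio approaches 2/rho
```

Since the ratio of successive coefficients tends to $2/\rho$, the series has radius of convergence $\rho/2$ whatever the value of $F$, in line with the statement above.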


THEOREM II. Let $G : \phi(x^i) = 0$ be a characteristic surface of multiplicity $\mu$
for the system

$$A^i\frac{\partial u}{\partial x^i} + Bu = f, \tag{1.2}$$

and let $T : \psi(x^i) = 0$ be non-characteristic. Then there exists a unique analytic
solution of (1.2) with assigned values on $T$ for the $\mu$ null variables $w_r$ relative to $G$,
and assigned values on $G$ for the remainder.


From the recursion formula (5.4) it is evident that the values of the proper
variables $w_r$ on $G$ depend on the data given on $T$. The values of the coefficients
$w_{r(0,n)}$ enter the solutions of the difference equations (5.4) by means of the
terms with coefficient $\alpha_{r-1(m,n)}$ of which the first is indicated explicitly.

On the other hand, if the elementary divisors relative to the eigenvalue
$\lambda = 0$ are all simple, then (5.1) has the form (3.2) since all coefficients $\alpha_{r-1}$
vanish. We have shown that in this case the values of the solution on $G$ depend
only on the values of the data assigned on $G$.


7. The general mixed problem. Let an initial surface $S : \phi(x^i) = 0$, and
a boundary surface $T : \psi(x^i) = 0$, both non-characteristic, meet in an edge $C$.
There will in general be a number of characteristic surfaces of (1.1) which pass
through $C$, as the characteristic equation (1.8) has degree $R$. For the present
we assume that each has multiplicity one. We select as domain $D$ one of the
four "quadrants" defined by $S$ and $T$, and choose any $k_0$ $(1 \leq k_0 \leq R)$ of these
characteristic surfaces $G_i$ $(i = 1, \dots, k_0)$ which lie in that quadrant $D$. A
solution of the differential equations is sought in $D$, which is analytic except
on $G_i$ $(i \leq k_0)$ and continuous there, which takes given Cauchy data on $S$, and
satisfies $k_0$ suitable boundary conditions on $T$.

An analytic solution taking the Cauchy values on $S$ can be constructed by
the Cauchy-Kowalewsky theorem. Supposing this done, we subtract away
this solution and so have a reduced problem with zero Cauchy data and homogeneous
differential equations. The selected characteristic surfaces $G_i$, which
we shall suppose do not intersect except on $C$, divide $D$ into $k_0 + 1$ domains
$D_i$ $(i = 0, 1, \dots, k_0)$ such that $D_i$ lies between $G_i$ and $G_{i+1}$ as in Fig. 1. In
each of these domains we shall construct a power series solution $u_{r(i)}$, which
shall be defined in every $D_j$, $j \geq i$. The final solution will take the form

Fig. 1


$$u_r = \sum_{i=1}^{h} u_{r(i)}$$

in $D_h$, and so will be analytic except on the $G_i$.

It is a well-known property of hyperbolic equations that discontinuities of
derivatives of solutions are confined to characteristic surfaces. Indeed the
magnitudes of transverse discontinuities of this type satisfy ordinary differential
equations along the bicharacteristics. Our series expansions will be determined
in the light of these facts. In the reduced problem the only non-homogeneous
terms are the boundary data and these may be said to generate the
whole solution of the reduced problem. Since the Cauchy solution is zero
in $D_0$ the solution is, so to speak, built up step by step in the $D_i$ from the power
series in $t - t_j(x, x^\rho)$ determined by the discontinuities of higher order encountered
in traversing the $G_j$ $(j \leq i)$.
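The way boundary data generate jumps of derivatives along a characteristic can be seen in the simplest scalar model. The Python sketch below is an illustration under assumptions that are not part of the text: a single equation $u_t + \lambda u_x = 0$ with constant $\lambda > 0$ in the quadrant $t > 0$, $x > 0$, zero Cauchy data on $t = 0$, and boundary datum $u(t, 0) = g(t)$ with $g(0) = 0$. The solution is $g(t - x/\lambda)$ above the characteristic $G : t = x/\lambda$ and zero below it, so it is continuous while its first $t$-derivative jumps by $g'(0)$ across $G$. The function names are hypothetical.

```python
def u(t, x, lam, g):
    """Model solution of u_t + lam*u_x = 0 with zero Cauchy data and u(t, 0) = g(t)."""
    tau = t - x / lam          # vanishes exactly on the characteristic G
    return g(tau) if tau > 0 else 0.0

def jump_of_time_derivative(x, lam, g, h=1e-6):
    """One-sided finite-difference estimate of the jump of u_t across G: t = x/lam."""
    t0 = x / lam
    above = (u(t0 + h, x, lam, g) - u(t0, x, lam, g)) / h
    below = (u(t0, x, lam, g) - u(t0 - h, x, lam, g)) / h
    return above - below

g = lambda t: 3.0 * t + t * t   # boundary datum with g(0) = 0 and g'(0) = 3
print(jump_of_time_derivative(x=0.5, lam=2.0, g=g))
```

In this model the jump is the same at every point of $G$; in the general system it varies along $G$ according to a differential equation, as described above.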

With $S : \phi(x^i) = t = 0$ and $T : \psi(x^i) = x = 0$, as before, we may write
the homogeneous system in canonical form:


= Ou, Ou, 
(7.1) tia r- Ox 
Here all elementary divisors are simple, by hypothesis, and the derivatives
with respect to $t$ and $x$ of $u_r$ appear only in the $r$th equation. We take $\lambda_r \neq \lambda_s$
$(r \neq s)$ for the present and note that the $\lambda_r$ need not be real. Those variables
$u_r$ which are null with respect to one of the $k_0$ selected characteristic surfaces
$G_i$ appear in the differential equation with eigenvalue $\lambda_i$. To maintain our
previous notations we distinguish the selected null variables $u_r$ by the symbol
$w_r$ $(r = 1, \dots, k_0)$. The remaining $R - k_0$ variables are denoted by $v_s$ $(s = 1,
\dots, R - k_0)$. Thus the differential system appears as:





+ L,(u,), Pm |, occ 


Ses ey, St + fo, w,) r=1,...,ko, 
ot Ox 
(7.2) an an 
9 Gy 1 Lams Wm) s=1,...,R— ko. 


Here the $L_r$ contain the transverse derivatives $\partial/\partial x^\rho$ only.
The $k_0$ boundary conditions shall take the form

$$w_r = \sum_s c_{rs} v_s + g_r, \qquad r = 1, \dots, k_0. \tag{7.3}$$

These are linear conditions solved for the proper or null variables (which are
in this instance the same). The datum functions $g_r(t, x^\rho)$ are real analytic on $T$,
and since our solution is to be continuous we postulate

$$g_r(0, x^\rho) = 0. \tag{7.4}$$
















8. The discontinuity expansion. We shall calculate the discontinuities
across $G_i$ of the successive derivatives with respect to $t$ of each of the unknowns
$v_s$, $w_r$. At each stage we must consider the jump of each component across each
of the selected characteristic surfaces. It turns out that most of these quantities
can be calculated directly, but that at each stage there are $k_0$ which must be
found by solving a differential equation on each of the $G_i$.

Let a discontinuity across $G_i$ of the $n$th time derivative of a function $u$ be
denoted by

$$(u^{(n)})_i. \tag{8.1}$$


We define parameters $s_i$ on $G_i$, measured from the edge $C = S \cap T$, such that
p ee J "ee 

(8.2) "= he roe a wen ko. 

Since all discontinuities to be considered are finite, and analytic along the
$G_i$, the total discontinuity $(u)_0$ taken across $C$ of a function defined on $T$ is the
sum of the limits of its jumps across the $G_i$:

$$(u)_0 = \sum_{i=1}^{k_0} (u)_i \Big|_{s_i = 0}. \tag{8.3}$$


Replacing derivatives with respect to $x$ by derivatives with respect to $s_i$
($i$ being fixed), by (8.2), we have


Owe yy Om ‘ 
(8.4) (A, —_— d,) ry = A, as, a rA,L, r= 3 owes Ro, 
and 

Ov, _  _ », = - 
(8.5) A, = d,) Ot = A; as, _ AL, s= 1, eeey R Ro. 


Let us suppose that only $g_i$ in (7.3) does not vanish; there is no loss of generality
as the equations are linear. The coefficients of $g_i$, expanded in a series of powers
of $t$, will be denoted by $g_{i(n)}$, and we shall assume, for the exposition, that
$g_{i(1)} \neq 0$. Then we calculate the first order jumps $(w_r^{(1)})_i$, $(v_s^{(1)})_i$ as follows.
First take $r \neq i$ in (8.4), and take the discontinuity across $G_i$. We get

$$(\lambda_r - \lambda_i)(w_r^{(1)})_i = 0, \qquad r \neq i, \tag{8.6}$$

since the other terms are continuous by hypothesis. Similarly
(8.7) (A, — (oo) = 0. 


Thus all first order jumps vanish except possibly $(w_i^{(1)})_i$. To find this quantity,
we differentiate $(8.4_i)$, that is, (8.4) with $r = i$, with respect to $t$ and take the jump across $G_i$. Since the
above left side is zero we get
aL, ) 
( ot (v, w) P 


: (w!”), 
a. «@ 
ax (we) + bee(we’)s. 


(8.8) me 


op 
= di 











—- &@ (PF 
















Here the appropriate coefficients in $L_i$ have been exhibited. All other terms,
being continuous, drop out when the jump operator is applied. (No summation
over repeated Latin indices is intended.) This equation has the Cauchy-Kowalewsky
normal form with respect to the edge $C$, in the variables $s_i$ and
$x^\rho$ $(\rho = 1, \dots, N - 2)$, since $C$ has the equation $s_i = 0$ in the surface $G_i$.
An initial condition on $C$ for (8.8) is now to be found. Using (8.3), and differentiating
(7.3) with respect to $t$ and taking jumps, we have


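$$(w_i^{(1)})_i\Big|_{s_i=0} = (w_i^{(1)})_0 - \sum_{j \neq i} (w_i^{(1)})_j\Big|_{s_j=0} = (w_i^{(1)})_0 = \sum_s c_{is}(v_s^{(1)})_0 + g_{i(1)} = g_{i(1)}. \tag{8.9}$$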


With this initial condition the single partial differential equation (8.8) has a
unique solution on $G_i$. This completes the calculation of the first-order jumps
and it may be noted that the non-homogeneous term $g_i$ induces a first order
contribution only from the corresponding proper variable $w_i$ over the corresponding
surface $G_i$.

If the first non-zero term in $g_i$ is of a higher order $n$, the only non-zero $n$th
order discontinuity is of the same kind as that just mentioned.

The discontinuities of higher orders are found in succession by this process.
Suppose known all jumps of order $n - 1$ or less, and let us find those of order
$n$. Differentiating (8.4) and (8.5) $n - 1$ times with respect to $t$, and taking
jumps over $G_i$, we have
9 
Os 


_—y 2 pe te—v (= ) 
de Bs, ya HA ap Ls Teves 


n in— a” 
(Ar — Ay) (we), = — \,>=— (w, »,4{ 21,) +..., 
é t 


om (A, — hs) (ve): 


Now the right hand sides are all known in terms of the discontinuities of order
$\leq n - 1$ already calculated. Again, provided $r \neq i$ in the first group, we
obtain the values of the $(w_r^{(n)})_i$ and $(v_s^{(n)})_i$ along $G_i$.

To find the remaining quantity $(w_i^{(n)})_i$ we differentiate $(8.4_i)$ $n$ times with
respect to $t$ and then take the discontinuity across $G_i$. The result is


a” 
( ar L,(v, w) ) 


) a‘, = (wy), + > b.-(wy”) 


a 8 
as. (w' " i 


(8.11) 


re) ™ n 
. + : a's ax’ (vs), + 7 bis (0e”) 


where the terms omitted are of discontinuity order less than $n$. However all
jumps present except that of $w_i$ are known and we obtain the non-homogeneous
differential equation

$$\frac{\partial}{\partial s_i}(w_i^{(n)})_i = a^\rho_{ii}\frac{\partial}{\partial x^\rho}(w_i^{(n)})_i + b_{ii}(w_i^{(n)})_i + K, \tag{8.12}$$


where $K$ stands for a known expression. The initial condition is now found from
(7.3) by differentiating $n$ times with respect to $t$ and taking jumps. Thus from
(8.3)

$$(w_i^{(n)})_i\Big|_{s_i=0} = (w_i^{(n)})_0 - \sum_{j \neq i} (w_i^{(n)})_j\Big|_{s_j=0}, \tag{8.13}$$


and by Leibnitz’ formula used in connection with (7.3), 


(8.14) (wl). = > em | ~ Jeo. + £in- 


The right hand sides are known and the initial value determined. Since (8.12)
is a non-homogeneous version of (8.8) the existence of an analytic solution on
$G_i$ follows as in the first order case. This completes the calculations for the
$n$th order.
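The use of Leibniz' formula in passing from (7.3) to (8.14) can be illustrated on polynomials. The Python sketch below is only an illustration: it treats a single product $c(t)\,v(t)$ of two polynomials given by their coefficient lists (a simplification; in the text the $c_{rs}$ and $v_s$ also depend on the $x^\rho$ and a sum over $s$ is taken), and checks that the $n$th derivative of the product at $t = 0$ equals $\sum_m \binom{n}{m} c^{(n-m)}(0)\, v^{(m)}(0)$. The helper names are hypothetical.

```python
from math import comb, factorial

def poly_derivative_at_zero(coeffs, n):
    """n-th derivative at t = 0 of the polynomial sum(coeffs[k] * t**k)."""
    return factorial(n) * coeffs[n] if n < len(coeffs) else 0

def poly_product(p, q):
    """Coefficient list of the product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def leibniz_at_zero(c, v, n):
    """Leibniz expansion of the n-th derivative of c(t) v(t) at t = 0."""
    return sum(
        comb(n, m) * poly_derivative_at_zero(c, n - m) * poly_derivative_at_zero(v, m)
        for m in range(n + 1)
    )

c = [1, 2, 3]        # c(t) = 1 + 2t + 3t^2
v = [0, 4, 5, 6]     # v(t) = 4t + 5t^2 + 6t^3
direct = [poly_derivative_at_zero(poly_product(c, v), n) for n in range(6)]
expanded = [leibniz_at_zero(c, v, n) for n in range(6)]
print(direct == expanded)
```

The binomial coefficients that appear here are the positive "combination symbols" that carry over unchanged to the dominating problem.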

It may be noted that the interaction of the calculations for the various $G_i$
$(i = 1, \dots, k_0)$ is brought about by the terms in the sum on the right side of
(8.13). The negative sign appearing there will have no special effect in the
proof of convergence.

The recursive construction being complete, both for the $v_s$ and the $w_r$, we
define the series of which the solution functions are composed. Let the $i$th
characteristic surface $G_i$ have the (analytic) equation $t = t_i(x, x^\rho)$. The series
$u_{r(i)}$ is now given by

$$u_{r(i)} = \sum_n (u_r^{(n)})_i \, (t - t_i(x, x^\rho))^n, \tag{8.15}$$

where $u_r$ stands for any one of the variables $v_s$, $w_r$. Then, as indicated previously,
the final formal solution is

$$u_r = \sum_{i=1}^{h} u_{r(i)} \quad \text{in } D_h, \qquad h = 0, 1, \dots, k_0. \tag{8.16}$$


To complete our existence proof we must show that these series have a common 
domain of convergence. 


9. Convergence of the discontinuity expansion. We will show that each
of the series (8.15) is dominated by the solution of a certain problem wherein
only one characteristic surface $G$ appears, and one boundary condition is
present. The solution of this simplified problem will follow from Theorem I.
We shall find expressions which dominate the various terms $(u_r^{(n)})_i$ by requiring
that the coefficients on the right sides of all of the differential equations and
recursive relations used in the construction of the solution should be simultaneously
majorized. Thus let $G = G(t, x^\rho)$ denote a series with positive terms
which dominates every one of the datum functions $g_r(t, x^\rho)$ on $T$, and vanishes
for $t = 0$; and let $K = K(t, x^\rho)$ dominate all of the coefficients $c_{rs}$ of (7.3).

























The dominating series in the differential operators are constructed as follows:
for $r \neq i$, divide (8.4) by $\lambda_r - \lambda_i$; and divide (8.5) by $\lambda_s - \lambda_i$. Then equations
of the form

$$\frac{\partial w_r}{\partial t} = \mathcal{L}_r(v, w), \qquad \frac{\partial v_s}{\partial t} = \mathcal{L}_s(v, w) \tag{9.1}$$

are formed, where $\mathcal{L}_r$ is a linear operator in the $x^\rho$ and in the tangential
variable $s_i$. Let every one of these operators be expanded in power series about
each of the characteristic surfaces $G_j$; that is, let the coefficients be written as
power series in each of the $k_0$ sets of variables $t - t_j$, $s_j$, and $x^\rho$. We can now
select series which dominate all of these series for every operator, and for every
value of the index $j$, and for every value of the index $i$. The variables $t - t_j$ and
$s_j$ shall be replaced by two common variables $t$ and $s$. Let us also attempt to
majorize all expansions of the $w_r$ about $G_i$ $(i \neq r)$ by a single series $W$, and all
expansions of the $v_s$ by a single series $V$. To dominate the development of $w_i$
about $G_i$, for each selected $i$, we take a third series $Z$. Thus we construct an
operator $L(V, W, Z)$ which will dominate the right sides of (9.1) provided that
$V$, $W$, and $Z$ dominate $v_s$, $w_r$ $(r \neq i)$ and $w_i$, respectively.

Similarly we consider the single equation $(8.4_i)$ for each $i$, and it has the
form

$$\frac{\partial w_i}{\partial s_i} = \mathcal{L}_i(v, w). \tag{9.2}$$

Let $L_1(V, W, Z)$ dominate the right side of (9.2), for all $i$, when its arguments
dominate those of $\mathcal{L}_i$.

Now consider the system, already in canonical form in the variables $t$, $s$
and $x^\rho$,

$$\frac{\partial V}{\partial t} = L(V, W, Z), \qquad \frac{\partial W}{\partial t} = L(V, W, Z), \qquad \frac{\partial Z}{\partial s} = L_1(V, W, Z). \tag{9.3}$$

By Theorem I, the appropriate auxiliary conditions include two for $t = 0$, viz.

$$V = W = 0, \tag{9.4}$$

and one for $s = 0$, which we take as

$$Z = R\,K(t, x^\rho)\,V + R\,W + G(t, x^\rho). \tag{9.5}$$

This system satisfies the conditions of Theorem I and the existence of a
convergent power series solution follows.

We now show by induction on $n$ that the coefficients $V_n$, $W_n$ and $Z_n$, in
the expansion of this solution in powers of $t$, dominate the series for the discontinuity
terms of order $n$ of $v_s$, $w_r$ $(r \neq i)$ and $w_i$, respectively, across $G_i$. The solution
of (9.3)-(9.5), the existence of which is guaranteed by Theorem I, could itself
be equally well regarded as a discontinuity expansion relative to the characteristic
surface $t = 0$. In general, therefore, the computations based upon it will
lead to series dominating the original discontinuity expansion.

To verify this in detail we begin with the first order jumps. Since $L_1(V, W, Z)$
has coefficient functions with positive coefficients in their expansions, we
find for $Z$ on $G : t = 0$ a series with positive coefficients. Thus $Z_0 > 0$. For
$V_1$ and $W_1$ we also get expressions with positive coefficients in view of the
choice of the operator $L(V, W, Z)$; and these certainly dominate the $(w_r^{(1)})_i$
and $(v_s^{(1)})_i$ which are all zero. For $Z_1$ we have a differential equation found by
differentiating the third of (9.3) with respect to $t$ and setting $t = 0$; the
operator on the right side of this equation certainly dominates that in (8.8).
To complete the demonstration for the first order terms we must show the
dominance of the initial value for $Z_1$ when $s = 0$, namely

$$Z_1\Big|_{s=0} = R\,K(0, x^\rho)\,V_1 + R\,K_t(0, x^\rho)\,V + R\,W_1 + G_1, \tag{9.6}$$

as is seen by differentiating (9.5) with respect to $t$ and then setting $t = 0$. A
comparison with (8.9) leads to the desired conclusion since $G$ dominates $g_i$.
Therefore the dominance holds for first order terms.

Proceeding by induction for higher orders, we shall assume that the
dominance holds for all orders less than $n$. By the definition of the operator
$L(V, W, Z)$, and by comparison with (8.4), (8.5) and (8.10), we see that

$$(w_r^{(n)})_i \ll W_n \qquad (r \neq i) \tag{9.7}$$

and

$$(v_s^{(n)})_i \ll V_n. \tag{9.8}$$

In the differential equation for $Z_n$, namely

$$\frac{\partial Z_n}{\partial s} = \frac{\partial^n}{\partial t^n} L_1(V, W, Z)\Big|_{t=0}, \tag{9.9}$$

every coefficient of a derivative of $Z_n$, and the coefficient of $Z_n$, will dominate
the corresponding terms in (8.11), and, moreover, the non-homogeneous terms
(jumps of lower order) will each dominate the corresponding items on the
right side of (8.11). Thus (9.9) dominates (8.12), in a formal sense. The
Cauchy-Kowalewsky solution of (9.9) is so constructed that the dominance will
then hold for the solutions if it holds for the initial conditions at $s = 0$.

From (8.13) and (8.14) we find 


(9.10) (w),| = 35 Cem Omal™)o + Bmw — XL (w), 
0 j#l 


s;=0 s.m= sj=0 


where the a, are a set of positive numerical coefficients of the type of com- 
bination symbols. This is dominated by 


(9.11) R >) Rin Gun Va + Ge + RWa, 
m=0 
























where the &,,, are symbols corresponding to the ¢;,, but obtained by differenti- 
ation from the dominant function K(t, x*). Here we have made use of (9.7) 
and (9.8), and have replaced the summation over s and over j by the factor 
R. However, by (9.5), the expression (9.11) is exactly the initial value which 
should be computed by the jump process, for the quantity Z,, at s = 0. 
Therefore

$$(w_i^{(n)})_i\Big|_{s_i=0} \ll Z_n\Big|_{s=0},$$

a dominating relation in the variables $x^\rho$ $(\rho = 1, \dots, N - 2)$ which completes
the calculations for order $n$. Thus the induction is complete and one
of the series $V$, $W$, $Z$ dominates each of the series (8.15).

As the series (8.15) converge for $t - t_i$ sufficiently small, and for suitably
small values of the remaining variables $x$ and $x^\rho$, they will all converge for
sufficiently small positive values of $t$ and $x$, since $t_i(x, x^\rho)$ tends to zero with $x$.
This establishes convergence of the solution in a neighbourhood of the origin,
which has been chosen, in effect, at a typical point of the edge $C$. Extension of
the domain is now possible by conventional methods, and will not be pursued
here, though we remark that analytic continuation must be pursued separately
for each sector domain $D_i$ $(i = 1, \dots, k_0)$.

Before stating our result as a theorem, we make two minor extensions
connected with the eigenvalues $\lambda_i$. First, it is permissible that the selected $\lambda_i$
should be a multiple eigenvalue of multiplicity $\mu$, say, provided that the corresponding
elementary divisors of $A^N$, with respect to $A^{N-1}$, are simple. Then the
Cauchy differential equations (8.8) and (8.11) become systems of order $\mu$ in $\mu$
proper variables $w_r$, but the formal structure of the calculations is not affected.
The number of boundary conditions is then the sum of the multiplicities of the
characteristic roots $\lambda_i$.

Secondly, the non-select characteristic roots may have larger multiplicity
without restriction on the elementary divisors. A comparison with Theorem I
shows that the additional terms present with non-simple divisors do not
disrupt the calculations. However, the variables proper to each eigenvalue
must generally be treated in a fixed order at every stage.


THEOREM III. Let non-characteristic surfaces $S$ and $T$ relative to the analytic
system

$$A^i\frac{\partial u}{\partial x^i} + Bu = f$$

intersect in an edge $C$ from which issue into a quadrant at least $k_0$ distinct characteristic
surfaces $G_i$. Let the elementary divisors referring to the eigenvalues $\lambda_i$ be
simple. Then there exists a solution, continuous in the quadrant and analytic
except across the $G_i$, which takes given Cauchy data on $S$, and for which the
variables $w_i$ null with respect to the $G_i$ take values on $T$ determined by linear
boundary conditions.













Let us remark, in conclusion, that the only restriction on the reality of the
eigenvalues is that the select $\lambda_i$ be real. They correspond to real characteristic
surfaces. The remaining roots may be complex, and the theorem is thus applicable
to systems which are only partly "hyperbolic" in nature. This freedom of
"type" should accompany a theorem based on the Cauchy-Kowalewsky
theorem, which is quite independent of such restrictions.


10. Uniqueness of the series solution. The expansions of Theorem III
imply that the solutions are analytic not only in each sector domain $D_i$, but
also on the closure of $D_i$. This is a stronger condition than that of being piecewise
analytic: for instance, $e^{-1/x}$ is piecewise analytic but is not analytic for
$x = 0$. We can therefore only assert, in general, that the series solution found
above is unique in the class of vector functions $u_r$ having this strong piecewise
analyticity. That it is unique in this class follows from the well-defined nature
of the construction of the solution.


Fig. 2


One case in which uniqueness of the solution in a wider class of real vector
functions can be shown is the case when all roots are real and different from
zero, and all positive roots are select. By a modification of Holmgren's theorem
(14, p. 34) we can prove uniqueness within the class of once continuously
differentiable vector functions. It is sufficient to prove such a uniqueness
theorem locally, and we therefore consider a region $R$ defined as follows. Let $S$
and $T$ meet in $C$ as before, and let $S_1$ be a surface nearly parallel to $S$, meeting
$S$ and $T$ in an edge $B$, which intersects $C$, and such that $S$, $T$ and $S_1$ enclose a
region $R$ which is a half of a lens-shaped region (Fig. 2). Let an analytic family
$S_t$ of surfaces $t = \text{const.}$ fill $R$ in such fashion that $S = S_0$ and $S_1 = S_{t=1}$.
Let the real characteristic roots $\beta_r$ of the matrix $-a^{N-1}_{rs}$, with respect to
$a^N_{rs}$, not vanish, change sign or become complex as $t$ varies from 0 to 1. This will
happen if $S_1$ is sufficiently near and parallel to $S_0$.










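Then we may write

$$L_r(u) = \frac{\partial u_r}{\partial t} + \beta_r\frac{\partial u_r}{\partial x} + \sum_\rho a^\rho_{rs}\frac{\partial u_s}{\partial x^\rho} + b_{rs}u_s \tag{10.1}$$

and we define the adjoint operator

$$M_r(v) = \frac{\partial v_r}{\partial t} + \frac{\partial}{\partial x}(\beta_r v_r) + \sum_\rho \frac{\partial}{\partial x^\rho}(a^\rho_{sr} v_s) - b_{sr}v_s. \tag{10.2}$$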


Consider the mixed problem for $M_r(v) = 0$ with "initial" surface $S_1$ and
boundary $T$, the solution being defined in $R$. The number of characteristic
surfaces issuing from $C_1$ into $R$ is equal to the number $R - k$ of negative roots
$\beta_r$; and we may suppose that all are select. By Theorem III, we can construct
a solution of the adjoint system with analytic "initial" values on $S_1$, and satisfying
$R - k$ suitable conditions on $T$.

Now suppose that $u_r$ vanishes on $S_0$ and that the $k$ select components
$u_s$ $(\beta_s > 0)$ satisfy homogeneous boundary conditions

$$u_s = \sum_{n=k+1}^{R} c_{sn} u_n, \qquad s = 1, \dots, k. \tag{10.3}$$

Supposing that $L_r(u) = 0$ and that the $u_r$ are $C^1$, we wish to show $u_r = 0$ in $R$.
Let the values $u_r$ on $S_1$ be approximated by analytic values for $v_r$ such that

$$|u_r - v_r| < \epsilon \quad \text{on } S_1. \tag{10.4}$$


Then let $v_r$ denote the piecewise analytic solution of $M_r(v) = 0$ with these
"initial" values on $S_1$, which satisfies the $R - k$ adjoint homogeneous boundary
conditions

$$\beta_n v_n = -\sum_{s=1}^{k} \beta_s c_{sn} v_s, \qquad n = k+1, \dots, R, \tag{10.5}$$

on $T$. Applying Green's formula, which is in this case

$$\int_R \big(v_r L_r(u) + u_r M_r(v)\big)\,dV = \int_{S_1 - S_0} \sum_r u_r v_r\,dS + \int_T \sum_r \beta_r u_r v_r\,dS, \tag{10.6}$$


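we see that the volume integral on the left vanishes. The surface integral
over $T$ becomes

$$\int_T \Big(\sum_{s=1}^{k} \beta_s u_s v_s + \sum_{n=k+1}^{R} \beta_n u_n v_n\Big)\,dS = \int_T \Big(\sum_{s=1}^{k}\sum_{n=k+1}^{R} \beta_s c_{sn} u_n v_s + \sum_{n=k+1}^{R} \beta_n u_n v_n\Big)\,dS = \int_T \sum_{n=k+1}^{R} u_n\Big(\beta_n v_n + \sum_{s=1}^{k} \beta_s c_{sn} v_s\Big)\,dS = 0. \tag{10.7}$$

Thus, as the integral over $S_0$ is zero since $u_r = 0$ there, we find from (10.4)

$$\int_{S_1} \sum_r u_r v_r\,dS = \int_{S_1} \sum_r u_r^2\,dS + O(\epsilon) = 0. \tag{10.8}$$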













Letting $\epsilon \to 0$ we see that the integral over $S_1$ is zero and hence that $u_r = 0$
on $S_1$. It follows that $u_r = 0$ in a region $R$ sufficiently near $S_0$, and this proves
the uniqueness theorem.

We may note that unless the datum functions $g_i(x^\rho, t)$ satisfy compatibility
conditions of the first order with respect to the initial data, the analytic solution
of Theorem III may not be $C^1$ across the $G_i$. However by subtracting away
the solution of an auxiliary problem in which the $g_i(x^\rho, t)$ are linear functions
of $t$, we can cause the discontinuities of first order to vanish, and for simplicity
we shall suppose that this has been done. We recall that all elementary divisors
are assumed to be simple.


If all characteristic roots are real and different from zero, and all positive roots
are select, the piecewise analytic solution of Theorem III is the only $C^1$ solution
of the problem.


An instance where uniqueness will hold in the stronger sense is that of symmetric
hyperbolic systems, which we now consider. The estimates to be found
for these systems also imply uniqueness in the $C^1$ class.


11. Symmetric hyperbolic systems. Theorem III may be used, in combination
with estimates of the Friedrichs-Lewy type and Sobolev's lemma, to
establish a mixed initial and boundary value theorem for symmetric hyperbolic
systems having a finite order of differentiability. A system (1.1) is called
symmetric hyperbolic (6) if the coefficient matrices are symmetric: $a^i_{rs} = a^i_{sr}$,
and if there exists a covariant vector $\xi_i$ such that $\xi_i a^i_{rs}$ is a positive definite
matrix. We note that for a symmetric system all elementary divisors are simple.

By a suitable transformation of coordinates we may suppose that $a^N_{rs}$ is
positive definite. The surface $S : t = x^N = \text{const.}$ is then said to be spacelike,
and we assume that the initial surface, carrying Cauchy data, has this property.
Let the boundary surface $T : x = x^{N-1} = 0$ meet $S$ in the edge $C$ of $N - 2$
dimensions. We mark off on $S$ an initial region $S_0$ having as boundary part of
$C$ and also a variety $B$ of $N - 2$ dimensions, which will be held fixed in the
following calculations. Let $S_t$ be a spacelike surface which meets $T$ in a locus
$C_t$ such that the boundary $\partial C_t$ lies in $B \cap C$, as in Fig. 2. The surfaces $C_t$ of
dimension $N - 2$ shall lie in $T$, having $t$-intercepts increasing with $t$ in an
obvious sense, except that the boundaries $\partial C_t$ are fixed. Thus the $C_t$ cover, for
$0 \leq t \leq t_1$, a lens-shaped portion of $T$, having the base $C$. The $t$-intercepts of
the family of spacelike surfaces $S_t$ shall also be increasing with $t$, except for the
fixed portion $B$ of $S_t$. The region of space covered by the $S_t$ is a half of a
lens-shaped region. All these surfaces are assumed to have a certain degree of
differentiability.

Since $a^N_{rs}$ is positive definite we may write the system (1.1) in normal form
relative to $S_t : t = \text{const.}$ and we may then apply the reduction to standard
form, relative to $x$ as second variable, given in (14, p. 53). The equations then
take the form



















$$\frac{\partial u_r}{\partial t} + \beta_r\frac{\partial u_r}{\partial x} + L_r(u) + f_r = 0, \qquad r = 1, \dots, R. \tag{11.1}$$


Here $L_r(u)$ is a symmetric operator in the remaining variables, and the smoothness
of the coefficients in (11.1) is unchanged by the transformation.

Reduction to this canonical form might equally be attained by the simultaneous
reduction of the pair of quadratic forms

$$a^N_{rs}u_r u_s, \qquad a^{N-1}_{rs}u_r u_s \tag{11.2}$$

to the standard forms (17, p. 148)

$$\sum_r u_r u_r, \qquad \sum_r \beta_r u_r u_r. \tag{11.3}$$


To each of the necessarily real characteristic roots $\beta_r$ there corresponds a
characteristic surface $G_r$ containing $C$. If $\beta_r$ is positive this characteristic
surface issues from $C$ into the domain $V$ wherein our solution is to be constructed.
Suppose that $k$ of the $R$ roots $\beta_r$ are positive, and that the multiplicity
of each root is constant in $V$. The remaining roots $\beta_r$ are negative since the
surfaces $S_t$ are not characteristic.

The variables $u_r$ corresponding in (11.1) to the positive $\beta_r$ are the null
variables of the $k$ characteristic surfaces $G_r$ lying in $V$. We assign $k$ linear
boundary conditions, expressible as

$$u_r = g_r, \qquad r = 1, \dots, k. \tag{11.4}$$

Here the data $g_r$ are assumed to have the same degree of differentiability as the
differential equations and auxiliary surfaces.

Thus $R$ functions, the Cauchy data, are given on $S$, and $k$ on $T$. We seek
a solution $u_r$ of (11.1) in $V$, which is continuously differentiable in $V$ except
across the $G_r$ where it need only be continuous. Since the Cauchy initial value
problem can be regarded as solved (10) we subtract away the solution and so
find zero Cauchy data for the reduced problem. Then the data $g_r$ in (11.4), to
be compatible with the above conditions, must vanish to the first order on $C$.
We shall suppose that they vanish, together with their derivatives of order
$\leq l - 1$, on $C$.


12. Estimates. We derive the Hilbert space estimates from a certain
differential identity, which contains the essential property of a symmetric
hyperbolic system (6). Let summation over all values of $i$, $r$, $s$, $m$ be understood
in the following equations. Writing the system as

$$L_r(u) = a^i_{rs}\frac{\partial u_s}{\partial x^i} + b_{rs}u_s + f_r = 0, \tag{12.1}$$

we have

2u,L,(u) = 2a}.u, at + b,,u;t, + ft, 
(12.2) . 


_ 21 (a',tt-ts) + Irs Uy, + frttr. 











Integrating over the domain $V_t$, we have, by the divergence formula,

$$0 = \int_{V_t} 2u_r L_r(u)\,dV = \int_{S_t - S_0 + T_t} a^i_{rs}u_r u_s n_i\,dS + \int_{V_t} Q(u, f)\,dV, \tag{12.3}$$

where $n_i$ denotes the covariant surface normal, and where $Q(u, f)$ is quadratic
in the $u_r$ and linear in the $f_r$.
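Now let us isolate the integral over $S_t$:

$$\int_{S_t} \sum_r u_r^2\,dS = \int_{S_t} a^N_{rs}u_r u_s\,dS = -\int_{V_t} Q(u, f)\,dV + \int_{S_0} \sum_r u_r^2\,dS + \int_{T_t} a^{N-1}_{rs}u_r u_s\,dS. \tag{12.4}$$

Here we have used $n_N = 1$, $n_{N-1} = -1$. From (11.1) we find

$$a^{N-1}_{rs}u_r u_s = \sum_{r=1}^{k} \beta_r u_r^2 + \sum_{r=k+1}^{R} \beta_r u_r^2; \tag{12.5}$$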


and the boundary conditions have been chosen so that this form is bounded
above. Indeed, in view of (11.2) and the negative values of all $\beta_r$ for $r > k$,
we have


k 
(12.6) ay, Urs << Dd, Byte, 


r=1 


Let us denote by $\|u\|^2_{S_t}$, $\|u\|^2_{T_t}$ the square integrals of $\sum_r u_r^2$ over $S_t$ and $T_t$,
respectively. Then


(127) lui =f Dew tav < Ki f |full3ae, 


and from (12.4) we find, by conventional majorizations, 


(12:8) |Iul|3, <K full, + llellbl + Ks J |lulibdt + |Uslle 


By iteration of this inequality we obtain 
(12.9) lls Ise < KL |||’ + llellz. + IIfllvle™**, 


and upon integration with respect to $t$,


eX 


K,° 
This is the Friedrichs-Lewy estimate for the components $u_r$.

By differentiation of the system we can show that all first derivatives, except 
those with respect to x, satisfy a similar system of first order equations and 
boundary conditions. Repeating the above argument, we can show that these 
derivatives satisfy similar estimates in which the derivatives of the data 
appear. From the system (12.1) we then find corresponding estimates for the 


(12.10) ||| |v. < KL lla] ls + llellre + Lived 











are 


Ww 








MIXED PROBLEMS FOR LINEAR SYSTEMS 159 


derivatives with respect to x. Estimates of all higher order derivatives can be 
found by repetition of this type of calculation; and we omit details. 

The existence of a solution is now established by a sequence of analytic
approximations, based on Theorem III. Let $u_r^{(m)}$ be the piecewise analytic
solution of an approximating analytic problem in which all coefficients and
functions together with their derivatives up to an order $[\tfrac12 N] + h + 1$
approximate in the square integral norm the corresponding quantities of (11.1). Then
the norms
$$\|D_{h'}u_r^{(m)}\|_{V_t}$$
are uniformly bounded, where $D_{h'}$ denotes a derivative of order
$$h' \le [\tfrac12 N] + h + 1.$$
By Sobolev's Lemma (15) the functions
$$D_hu_r^{(m)}$$
are then uniformly bounded, and by Ascoli's theorem (5, p. 122) we can select
a subsequence which converges, together with all derivatives of order $< h$, to
a limit $u_r$. This limit is a solution of the non-analytic problem. For this result
we shall assume that the given system, surfaces, and boundary data are of
class $C^{[\frac12 N]+h+1}$, and that the data $g_r$ of (11.4) satisfy on C compatibility
conditions of order $l$. Then the approximations $u_r^{(m)}$ are of class $C^l$ in V, as is
easily seen by examining the series expansions of Theorem III. Thus for $l < h$,
the final solution $u_r$ is $C^h$ in V except across the characteristic surfaces issuing
into V from C, where $u_r$ is $C^l$.
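The order $[\tfrac12 N] + h + 1$ reflects the loss of derivatives in Sobolev's lemma. In one common formulation, given here as an assumption about the form intended, $V$ is a bounded domain in $\mathbf{R}^N$ with sufficiently regular boundary, $N$ is taken to be the number of independent variables, and the constant $C$ depends only on $V$ and $N$:

```latex
\[
\sup_{V}\,|v| \;\le\; C \sum_{|\alpha| \le [N/2]+1} \| D^{\alpha} v \|_{L^2(V)} .
\]
% Applied to v = D^{\gamma} u_r^{(m)} with |\gamma| \le h, the right-hand side
% involves only derivatives of u_r^{(m)} of order \le [N/2] + h + 1, whose
% square-integral norms are uniformly bounded in m.
```

Read this way, square-integral bounds on derivatives of order up to $[\tfrac12 N] + h + 1$ yield pointwise bounds on derivatives of order up to $h$, which is the input required by Ascoli's theorem.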

We remark that the number of boundary conditions is determined by the 
signature of the second quadratic form (11.2). 


THEOREM IV. A symmetric hyperbolic system (12.1) of differentiability class
$[\tfrac12 N] + h + 1$ has a unique solution in a domain V bounded in part by a space-
like initial surface S and a boundary surface T, which

(a) assumes given Cauchy data on S,

(b) satisfies k boundary conditions (11.4) on T, where k is the number of
characteristic surfaces issuing from $C = T \cap S$ into V,

(c) is of class $C^h$ in V except across these characteristic surfaces where it is of
class $C^l$.


Extension of the domain has been treated for similar problems in (3, 8, 10) 
and will not be pursued here. The above method of estimation will apply to 
boundary conditions of the form 

$$u_r = g_r + \epsilon\sum_{s=k+1}^{R}c_{rs}u_s,$$
provided that $|\epsilon|$ is sufficiently small.
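One way to see why small $|\epsilon|$ suffices is sketched below. It assumes that the quadratic form on $T$ has the diagonal shape of (12.5), that the coefficients $c_{rs}$ and $\beta_r$ are bounded on $T$, and that $\beta_s \le -\beta_0 < 0$ for $s > k$. The auxiliary quantities $w_r$, $M$ and $\beta_0$ are introduced here for illustration and do not appear in the paper.

```latex
\begin{align*}
u_r &= g_r + \epsilon\, w_r, \qquad
       w_r = \sum_{s=k+1}^{R} c_{rs}u_s \qquad (r \le k), \\
u_r^2 &\le 2g_r^2 + 2\epsilon^2 w_r^2
       \;\le\; 2g_r^2 + 2\epsilon^2
          \Bigl(\sum_{s=k+1}^{R} c_{rs}^2\Bigr)
          \Bigl(\sum_{s=k+1}^{R} u_s^2\Bigr), \\
\sum_{r=1}^{k}\beta_r u_r^2 + \sum_{s=k+1}^{R}\beta_s u_s^2
   &\le 2\sum_{r=1}^{k}|\beta_r|\,g_r^2
      + \bigl(2\epsilon^2 M - \beta_0\bigr)\sum_{s=k+1}^{R} u_s^2,
   \qquad M = \sup_T \sum_{r=1}^{k}|\beta_r|\sum_{s=k+1}^{R} c_{rs}^2 .
\end{align*}
% For 2\epsilon^2 M \le \beta_0 the last term is nonpositive, so the form is
% again bounded above by a multiple of \sum_{r \le k} g_r^2.
```

The second line uses the Cauchy-Schwarz inequality. Under these assumptions the boundary form remains bounded above by the data alone, and the remainder of the estimation proceeds as before.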


It is a pleasure to acknowledge the cooperation of Abraham Robinson
which has been of the greatest assistance. To Professor K. O. Friedrichs I am
indebted for an illuminating discussion of the symmetric hyperbolic systems.
This work was largely carried out at the 1956 Summer Research Institute of the
Canadian Mathematical Congress, and I wish to thank the Sloan Foundation
for a fellowship held at that time.


REFERENCES

1. G. Birkhoff and S. MacLane, A Survey of Modern Algebra (New York, 1941).
2. L. Campbell and A. Robinson, Mixed problems for hyperbolic partial differential equations, Proc. Lond. Math. Soc. (3), 18 (1955), 129-147.
3. G. F. D. Duff, A mixed problem for normal hyperbolic linear partial differential equations of second order, Can. J. Math., 9 (1957), 141-160.
4. E. Goursat and E. R. Hedrick, Mathematical Analysis, Vol. II, Part II (Boston, 1917).
5. M. Graves, Theory of Functions of Real Variables (New York, 1946).
6. K. O. Friedrichs, Symmetric hyperbolic linear differential equations, Comm. Pure and App. Math., 7 (1954), 345-392.
7. K. O. Friedrichs and H. Lewy, Ueber die Eindeutigkeit und die Abhängigkeitsgebiet der Lösungen beim Anfangswertproblem linearer hyperbolischer Differentialgleichungen, Math. Ann., 28 (1927), 192-204.
8. M. Kryzyanski and J. Schauder, Quasi lineare Differentialgleichungen zweiter Ordnung vom hyperbolischen Typus, Gemischte Randwertaufgaben, Studia Math., 6 (1936), 152-189.
9. O. Ladyzhenskaya, Mixed Problems for Hyperbolic Equations (Moscow, 1953).
10. J. Leray, Hyperbolic Differential Equations (Princeton, 1953).
11. J. L. Lions, Problèmes aux limites en théorie des distributions, Acta Math., 94 (1955), 13-153.
12. J. L. Lions, Opérateurs de Delsarte et problèmes mixtes, Bull. Soc. Math., 84 (1956), 9-95.
13. J. L. Lions, Quelques applications d'opérateurs de transmutation, Proc. Colloque internationale du C.N.R.S., 71 (1956), 125-137.
14. I. G. Petrowsky, Lectures on partial differential equations (trans.) (New York, 1954).
15. S. Sobolev, Doklady, 10 (1936), 277-282.
16. J. M. Thomas, Riquier's existence theorems, Ann. Math. (2), 30 (1929), 285-310; 35 (1934), 306-311.
17. B. L. van der Waerden, Moderne Algebra, Vol. II (Berlin, 1931).


University of Toronto 


































